Frontiers of Optoelectronics

ISSN 2095-2759

ISSN 2095-2767(Online)

CN 10-1029/TN

Postal Subscription Code 80-976

Front. Optoelectron.    2020, Vol. 13 Issue (4) : 418-424    https://doi.org/10.1007/s12200-019-0854-0
RESEARCH ARTICLE
A CCD based machine vision system for real-time text detection
Shihua ZHAO1, Lipeng SUN1, Gang LI2, Yun LIU1, Binbing LIU3
1. State Grid Hunan Electric Power Corporation Limited Research Institute, Changsha 410007, China
2. State Grid Hunan Electric Power Corporation Limited, Changsha 410007, China
3. School of Optical and Electronics Information, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract

Text detection and recognition is an active topic in computer vision and is regarded as a further development of traditional optical character recognition (OCR) technology. With the rapid development of machine vision systems and the wide application of deep learning algorithms, text recognition has achieved excellent performance. In contrast, detecting text blocks in complex natural scenes remains a challenging task. Many advanced natural scene text detection algorithms have been proposed, but most of them run slowly because of the complexity of the detection pipeline and therefore cannot be applied in industrial scenarios. In this paper, we propose a CCD based machine vision system for real-time text detection in invoice images. We optimize the system in several respects, including the optical system, the hardware architecture, and the deep learning algorithm, to improve its speed. The experimental data confirm that these optimizations significantly increase the running speed of the machine vision system and allow it to meet the real-time text detection requirements of industrial scenarios.

Keywords: machine vision, text detection, optical character recognition (OCR), deep learning
Corresponding Author(s): Binbing LIU   
Online First Date: 07 August 2019    Issue Date: 31 December 2020
 Cite this article:   
Shihua ZHAO,Lipeng SUN,Gang LI, et al. A CCD based machine vision system for real-time text detection[J]. Front. Optoelectron., 2020, 13(4): 418-424.
 URL:  
https://academic.hep.com.cn/foe/EN/10.1007/s12200-019-0854-0
https://academic.hep.com.cn/foe/EN/Y2020/V13/I4/418
Fig.1  A typical CCD based machine vision system for real-time text detection and recognition
Fig.2  Natural scene text detection: find and locate the text blocks
Fig.3  Detection results of the CTPN algorithm on two invoice images
Fig.4  Time consumption of the CTPN algorithm in various hardware environments
Stage        Time
CNN          3.4 s
LSTM         +0.2 s
proposal     +1.4 s
detection    +0.3 s
total        5.3 s
Tab.1  Time consumption of each stage of the CTPN algorithm
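The stage timings in Tab.1 can be converted into shares of the total to show where the time goes. The following is a minimal sketch; the values are copied from the table, while the dictionary and function names are illustrative and not taken from the paper.

```python
# Stage timings from Tab.1, in seconds (names are illustrative).
STAGE_TIMES_S = {
    "CNN": 3.4,
    "LSTM": 0.2,
    "proposal": 1.4,
    "detection": 0.3,
}


def stage_shares(stage_times):
    """Return each stage's fraction of the summed run time."""
    total = sum(stage_times.values())
    return {stage: t / total for stage, t in stage_times.items()}


if __name__ == "__main__":
    total = sum(STAGE_TIMES_S.values())
    print(f"total: {total:.1f} s")
    for stage, share in stage_shares(STAGE_TIMES_S).items():
        print(f"{stage:<10} {share:6.1%}")
```

With these numbers the CNN stage accounts for roughly 64% of the run time, and the four stages sum to the 5.3 s total reported in the table.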
Fig.5  Time consumption of the CTPN algorithm with various network architectures. (a) Test results on CPU (Intel i7); (b) test results on GPU (NVidia GTX1080)
Method                         Speed improvement    Accuracy degradation
hyperparameter optimization    10%                  yes
SIMD acceleration              50%                  no
GPU acceleration               2000%                no
network model optimization     80%                  yes
optical system improvement     250%                 no
Tab.2  Contribution of various optimization methods to algorithm acceleration
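Tab.2 does not state how "speed improvement" is defined. Assuming it means a relative increase in throughput, so that an improvement of p% corresponds to a speedup factor of 1 + p/100, the entries can be converted as sketched below. This interpretation and all names in the code are assumptions and are not taken from the paper.

```python
# Speed improvements from Tab.2, in percent (names are illustrative).
IMPROVEMENT_PERCENT = {
    "hyperparameter optimization": 10,
    "SIMD acceleration": 50,
    "GPU acceleration": 2000,
    "network model optimization": 80,
    "optical system improvement": 250,
}


def speedup_factor(improvement_percent):
    """Convert a percentage improvement to a speedup factor.

    ASSUMPTION: a p% improvement means throughput is multiplied
    by 1 + p/100. The source does not confirm this definition.
    """
    return 1.0 + improvement_percent / 100.0


if __name__ == "__main__":
    for method, pct in IMPROVEMENT_PERCENT.items():
        print(f"{method:<30} {speedup_factor(pct):5.1f}x")
```

Under this assumption GPU acceleration corresponds to a factor of 21, the largest single contribution in the table. The sketch does not combine the factors, since the table does not say whether the methods compose multiplicatively.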