Frontiers of Optoelectronics

ISSN 2095-2759

ISSN 2095-2767(Online)

CN 10-1029/TN

Postal Subscription Code 80-976

Front. Optoelectron.    2019, Vol. 12 Issue (3) : 324-331    https://doi.org/10.1007/s12200-019-0853-1
RESEARCH ARTICLE
High accuracy object detection via bounding box regression network
Lipeng SUN1, Shihua ZHAO1, Gang LI2, Binbing LIU3
1. State Grid Hunan Electric Power Corporation Limited Research Institute, Changsha 410007, China
2. State Grid Hunan Electric Power Corporation Limited, Changsha 410007, China
3. School of Optical and Electronics Information, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract

As one of the primary computer vision problems, object detection aims to find and locate semantic objects in digital images. Unlike object classification, which only assigns an object to a certain class, object detection must also extract accurate object locations. In state-of-the-art object detection algorithms, bounding box regression plays a critical role in achieving high localization accuracy. Almost all popular deep-learning-based object detection algorithms use bounding box regression to fine-tune object locations. However, although bounding box regression is widely used, few studies have focused on its underlying rationale, performance dependencies, and performance evaluation. In this paper, we propose a dedicated deep neural network for bounding box regression and present several methods to improve its performance. Ad hoc experiments are conducted to demonstrate the effectiveness of the network. We also apply the network as an auxiliary module to the Faster R-CNN algorithm and test the combination on real-world images. Experimental results show improved detection accuracy in terms of mean IOU.
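Since the abstract reports accuracy as mean IOU, a minimal sketch of that metric may help. It assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates and a one-to-one pairing of predicted and ground-truth boxes; the function names `iou` and `mean_iou` are illustrative and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


def mean_iou(predicted, ground_truth):
    """Average IOU over paired boxes (assumed pairing: same index)."""
    pairs = list(zip(predicted, ground_truth))
    return sum(iou(p, g) for p, g in pairs) / len(pairs)
```

Under this definition, a regression step that moves a proposal closer to its ground-truth box raises the IOU of that pair, and therefore the mean.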

Keywords deep learning      object detection      bounding box regression      IOU distribution     
Corresponding Author(s): Binbing LIU   
Online First Date: 18 June 2019    Issue Date: 16 September 2019
 Cite this article:   
Lipeng SUN,Shihua ZHAO,Gang LI, et al. High accuracy object detection via bounding box regression network[J]. Front. Optoelectron., 2019, 12(3): 324-331.
 URL:  
https://academic.hep.com.cn/foe/EN/10.1007/s12200-019-0853-1
https://academic.hep.com.cn/foe/EN/Y2019/V12/I3/324
Fig.1  Object detection: find and locate objects in images
Fig.2  Schematic diagram of bounding box regression
Fig.3  Flow chart of the proposed method
Fig.4  Comparison of adjusted distribution with original distribution of IOU. (a) IOU distribution on uniformly distributed BTC; (b) IOU distribution on Laplacian distributed BTC
Fig.5  Convergence curve of RMSE
                 baseline   data_aug   iou_uniform
proposal IOU     0.7982     0.7982     0.7982
predicted IOU    0.8312     0.9076     0.9354
Tab.1  Improvement of IOU in three algorithm configurations
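The gain each configuration achieves over the proposal IOU follows directly from the Tab.1 values. The short sketch below only performs that subtraction; the dictionary layout and variable names are illustrative assumptions, while the numbers are copied from the table.

```python
# (proposal IOU, predicted IOU) per configuration, copied from Tab.1.
tab1 = {
    "baseline": (0.7982, 0.8312),
    "data_aug": (0.7982, 0.9076),
    "iou_uniform": (0.7982, 0.9354),
}

# Absolute IOU gain of the regressed box over the input proposal.
gains = {
    name: round(predicted - proposal, 4)
    for name, (proposal, predicted) in tab1.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.4f}")
```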
Fig.6  Detection results of BBR network on some real-world images