Robust feature learning for online discriminative tracking without large-scale pre-training

doi:10.1007/s11704-017-6281-8

Front. Comput. Sci., 2018, Vol. 12, Issue 6: 1160-1172

RESEARCH ARTICLE


Jun ZHANG, Bineng ZHONG, Pengfei WANG, Cheng WANG, Jixiang DU

Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China


Abstract

Owing to the inherent lack of training data in visual tracking, recent work on deep learning-based trackers has focused on learning a generic representation offline from large-scale training data and transferring the pre-trained feature representation to a tracking task. Offline pre-training is time-consuming, and the learned generic representation may be either less discriminative for tracking specific objects or overfitted to typical tracking datasets. In this paper, we propose an online discriminative tracking method based on robust feature learning without large-scale pre-training. Specifically, we first design a PCA filter bank-based convolutional neural network (CNN) architecture to learn robust features online from a few positive and negative samples in the high-dimensional feature space. Then, we use a simple soft-thresholding method to produce sparse features that are more robust to target appearance variations. Moreover, we increase the reliability of our tracker using edge information generated from edge box proposals during visual tracking. Finally, effective visual tracking results are achieved by systematically combining the tracking information and edge box-based scores in a particle filtering framework. Extensive results on the widely used online tracking benchmark (OTB-50), which contains 50 videos, validate the robustness and effectiveness of the proposed tracker without large-scale pre-training.
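Two of the steps named in the abstract can be illustrated with a minimal sketch. The soft-thresholding function below is the standard shrinkage operator. The score combination is only an assumption made for illustration: the abstract does not state how tracking scores and edge box-based scores are fused, so the convex weighting, the parameter `alpha`, and the function names are hypothetical and are not taken from the paper.

```python
from typing import List


def soft_threshold(values: List[float], tau: float) -> List[float]:
    """Standard soft-thresholding: shrink each value toward zero by tau.

    Values whose magnitude is at most tau become exactly 0, which is
    what makes the resulting feature vector sparse.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    out = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude <= 0:
            out.append(0.0)
        else:
            out.append(magnitude if v > 0 else -magnitude)
    return out


def combine_scores(track_scores: List[float],
                   edge_scores: List[float],
                   alpha: float = 0.5) -> List[float]:
    """Hypothetical fusion of per-particle scores (an assumption, not the
    paper's rule): a convex combination normalised to sum to 1 so the
    result can be used as particle weights.
    """
    if len(track_scores) != len(edge_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    combined = [(1.0 - alpha) * t + alpha * e
                for t, e in zip(track_scores, edge_scores)]
    total = sum(combined)
    if total <= 0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(combined)] * len(combined)
    return [c / total for c in combined]


if __name__ == "__main__":
    sparse = soft_threshold([0.5, -0.2, 0.05, -1.0], tau=0.1)
    weights = combine_scores([1.0, 3.0], [1.0, 1.0], alpha=0.5)
    best = max(range(len(weights)), key=weights.__getitem__)
    print(sparse, weights, best)
```

In a particle filter, the particle with the largest weight (or the weighted mean of the particle states) would typically be reported as the tracking result for the current frame. Which of the two the authors use is not stated in the abstract.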

Keywords: visual tracking, convolutional neural networks, PCA, Edge Box

Corresponding Author(s): Bineng ZHONG

Just Accepted Date: 29 September 2017; Online First Date: 06 July 2018; Issue Date: 04 December 2018

Cite this article:

Jun ZHANG, Bineng ZHONG, Pengfei WANG, et al. Robust feature learning for online discriminative tracking without large-scale pre-training[J]. Front. Comput. Sci., 2018, 12(6): 1160-1172.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-017-6281-8
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I6/1160


