|
|
Robust feature learning for online discriminative tracking without large-scale pre-training |
Jun ZHANG, Bineng ZHONG, Pengfei WANG, Cheng WANG, Jixiang DU |
Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China |
|
|
Abstract Owing to the inherent lack of training data in visual tracking, recent work on deep learning-based trackers has focused on learning a generic representation offline from large-scale training data and transferring the pre-trained feature representation to the tracking task. However, offline pre-training is time-consuming, and the learned generic representation may be either less discriminative for tracking specific objects or overfitted to typical tracking datasets. In this paper, we propose an online discriminative tracking method based on robust feature learning without large-scale pre-training. Specifically, we first design a PCA filter bank-based convolutional neural network (CNN) architecture to learn robust features online from a few positive and negative samples in the high-dimensional feature space. Then, we use a simple soft-thresholding method to produce sparse features that are more robust to target appearance variations. Moreover, we increase the reliability of our tracker using edge information generated from edge box proposals during tracking. Finally, effective visual tracking results are achieved by systematically combining the tracking information and edge box-based scores in a particle filtering framework. Extensive results on the widely used online tracking benchmark (OTB-50) with 50 videos validate the robustness and effectiveness of the proposed tracker without large-scale pre-training.
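The first two steps described in the abstract (learning a PCA filter bank from a few samples, then soft-thresholding the filter responses) can be illustrated with a minimal sketch. This is not the authors' implementation: the patch size `k`, the number of filters, the threshold `tau`, the single-layer design, and all function names are assumptions chosen for illustration, and random arrays stand in for the positive and negative samples.

```python
import numpy as np

def extract_patches(image, k):
    """Collect every k x k patch of a 2-D image as a mean-removed row vector."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    patches = windows.reshape(-1, k * k).astype(np.float64)
    return patches - patches.mean(axis=1, keepdims=True)

def learn_pca_filters(images, k=5, num_filters=4):
    """Use the leading eigenvectors of the patch covariance as k x k filters."""
    patches = np.vstack([extract_patches(img, k) for img in images])
    cov = patches.T @ patches / patches.shape[0]
    _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :num_filters]   # keep the largest components
    return top.T.reshape(num_filters, k, k)

def filter_responses(image, filters):
    """Correlate the image with each filter (valid region only)."""
    k = filters.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    return np.einsum("ijab,fab->fij", windows, filters)

def soft_threshold(x, tau):
    """Shrink values toward zero by tau; magnitudes below tau become zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Hypothetical stand-in data: six random 16 x 16 "samples".
rng = np.random.default_rng(0)
samples = [rng.random((16, 16)) for _ in range(6)]
filters = learn_pca_filters(samples, k=5, num_filters=4)
features = soft_threshold(filter_responses(samples[0], filters), tau=0.1)
```

Because the filters come from an eigendecomposition of the sample patches rather than from gradient-based training, they can be recomputed online from a handful of samples, which is the property the abstract relies on to avoid large-scale pre-training.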
|
Keywords
visual tracking
convolutional neural networks
PCA
Edge Box
|
Corresponding Author(s):
Bineng ZHONG
|
Just Accepted Date: 29 September 2017
Online First Date: 06 July 2018
Issue Date: 04 December 2018
|
|
|