Real-time visual tracking using complementary kernel support correlation filters
Zhenyang SU1,2, Jing LI1, Jun CHANG1, Bo DU1, Yafu XIAO1
1. School of Computer Science, Wuhan University, Wuhan 430072, China; 2. Department of Digital Media Technology, Huanggang Normal University, Huangzhou 438000, China
Despite the demonstrated success of SVM-based trackers, there is still room to improve their performance when two factors are considered carefully. First, the tradeoff between sampling and budgeting samples strongly affects both tracking accuracy and efficiency. Second, how effectively different types of features are fused to learn a robust target representation plays a key role in tracking accuracy. In this paper, we propose a novel SVM-based tracking method that handles the first factor by exploiting the circulant structure of the samples and the second through a multi-kernel learning mechanism. Specifically, we formulate an SVM classification model for visual tracking that incorporates two types of kernels whose matrices are circulant, taking full advantage of the complementary traits of the color and HOG features to learn a robust target representation. Moreover, the SVM model has a closed-form solution for both the classifier weights and the kernel weights, and both can be computed efficiently via fast Fourier transforms (FFTs). Extensive evaluations on the OTB100 and VOT2016 visual tracking benchmarks demonstrate that the proposed method performs favorably against various state-of-the-art trackers while running at 50 fps on a single CPU.
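The efficiency claim in the abstract rests on a general property of circulant matrices: they are diagonalized by the discrete Fourier transform, so products and linear solves involving them reduce to element-wise operations in the Fourier domain. The sketch below illustrates only that property on a toy 1-D example. It is not the paper's SVM solver or its kernel-weight update; the helper names (`fft`, `circulant`, `matvec_fft`, `solve_fft`), the toy vectors, and the regularizer `lam` are assumptions made for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(X):
    """Inverse FFT with the usual 1/n normalization."""
    n = len(X)
    return [v / n for v in fft(X, inverse=True)]


def circulant(c):
    """Dense circulant matrix with first column c: C[i][j] = c[(i - j) % n]."""
    n = len(c)
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]


def matvec_dense(C, v):
    """Plain O(n^2) matrix-vector product, used only as a reference."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in C]


def matvec_fft(c, v):
    """C @ v as a circular convolution: ifft(fft(c) * fft(v))."""
    prod = [a * b for a, b in zip(fft(c), fft(v))]
    return [z.real for z in ifft(prod)]


def solve_fft(c, y, lam):
    """Solve (C + lam * I) a = y; the eigenvalues of C are fft(c)."""
    quot = [b / (a + lam) for a, b in zip(fft(c), fft(y))]
    return [z.real for z in ifft(quot)]


# Toy data (assumed): a symmetric first column, as a kernel matrix would have.
c = [4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
lam = 0.5

C = circulant(c)
matvec_err = max(abs(a - b) for a, b in zip(matvec_dense(C, v), matvec_fft(c, v)))

# Check the solve by multiplying back with the dense (C + lam * I).
a_hat = solve_fft(c, v, lam)
C_reg = [[C[i][j] + (lam if i == j else 0.0) for j in range(8)] for i in range(8)]
solve_err = max(abs(p - q) for p, q in zip(matvec_dense(C_reg, a_hat), v))
```

Both `matvec_err` and `solve_err` should be at the level of floating-point round-off. The dense route costs O(n^2) for the product and O(n^3) for a general solve, whereas the Fourier route costs O(n log n), which is the kind of saving that makes real-time rates plausible on a CPU.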
Zhenyang SU, Jing LI, Jun CHANG, Bo DU, Yafu XIAO. Real-time visual tracking using complementary kernel support correlation filters. Front. Comput. Sci., 2020, 14(2): 417-429.