Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal distribution code: 80-970

2019 Impact Factor: 1.275

Frontiers of Computer Science  2022, Vol. 16 Issue (4): 164317   https://doi.org/10.1007/s11704-021-0173-7
Accelerating temporal action proposal generation via high performance computing
Tian WANG1,2, Shiye LEI2, Youyou JIANG3, Choi CHANG4, Hichem SNOUSSI5, Guangcun SHAN6, Yao FU7
1. Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
2. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
3. School of Software, Tsinghua University, Beijing 100084, China
4. Department of Computer Engineering, Gachon University, Seongnam 13120, South Korea
5. Institute Charles Delaunay-LM2S FRE CNRS 2019, University of Technology of Troyes, Troyes 10010, France
6. School of Instrumentation Science and Opto-electronics Engineering, Beihang University, Beijing 100191, China
7. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
Abstract

Temporal action proposal generation aims to output the starting and ending times of each potential action in long videos, and it often suffers from high computation cost. To address this issue, we propose a new temporal convolution network called Multipath Temporal ConvNet (MTCN). In our work, a novel high-performance ring parallel architecture is further introduced into temporal action proposal generation to meet the demands of large memory occupation and a large number of videos. Remarkably, the total data transmission is reduced by adding a connection between multiple computing loads in the newly developed architecture. Compared to the traditional Parameter Server architecture, our parallel architecture achieves higher efficiency on temporal action detection tasks with multiple GPUs. We conduct experiments on ActivityNet-1.3 and THUMOS14, where our method outperforms other state-of-the-art temporal action detection methods with high recall and high temporal precision. In addition, a time metric is proposed to evaluate the speed performance in the distributed training process.
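The abstract contrasts a ring parallel architecture with a Parameter Server but does not spell out the communication pattern. As an illustration only, the sketch below simulates a generic ring all-reduce in plain Python; it is not the authors' implementation, and the function names `ring_allreduce` and `parameter_server_traffic`, the chunking scheme, and the traffic accounting are all assumptions made for this example.

```python
def ring_allreduce(grads):
    """Simulate a generic ring all-reduce (assumed scheme, not the paper's code).

    grads: one equal-length gradient vector per worker; the length must be
    divisible by the number of workers. Returns (summed vector held by every
    worker, number of values each worker sent).
    """
    n = len(grads)
    length = len(grads[0])
    assert length % n == 0, "vector length must split evenly into n chunks"
    size = length // n
    # chunks[w][c] is worker w's local copy of chunk c
    chunks = [[list(g[c * size:(c + 1) * size]) for c in range(n)] for g in grads]
    sent = 0

    # Phase 1 (scatter-reduce): after n-1 steps, worker w holds the full sum
    # of chunk (w + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing messages so all sends in a step are simultaneous.
        msgs = [(w, (w - step) % n) for w in range(n)]
        payloads = [list(chunks[w][c]) for w, c in msgs]
        for (w, c), payload in zip(msgs, payloads):
            dst = (w + 1) % n
            chunks[dst][c] = [a + b for a, b in zip(chunks[dst][c], payload)]
        sent += size

    # Phase 2 (all-gather): completed chunks travel around the ring.
    for step in range(n - 1):
        msgs = [(w, (w + 1 - step) % n) for w in range(n)]
        payloads = [list(chunks[w][c]) for w, c in msgs]
        for (w, c), payload in zip(msgs, payloads):
            chunks[(w + 1) % n][c] = payload
        sent += size

    result = [[v for chunk in worker for v in chunk] for worker in chunks]
    return result, sent


def parameter_server_traffic(n_workers, length):
    """Values crossing a single central server's link per update (assumed model):
    it receives one full vector from each worker and sends one back to each."""
    return 2 * n_workers * length


if __name__ == "__main__":
    grads = [[w + 1] * 8 for w in range(4)]  # 4 workers, 8 values each
    result, sent = ring_allreduce(grads)
    print(result[0], sent, parameter_server_traffic(4, 8))
```

In this toy model each worker sends 2(n-1)/n of the vector length regardless of how the ring grows, whereas the central server's link carries traffic proportional to the number of workers, which is the usual argument for ring-style communication.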

Keywords: temporal convolution; temporal action proposal generation; deep learning
Received: 2020-04-27      Published: 2021-11-18
Corresponding Author(s): Guangcun SHAN   
Cite this article:
Tian WANG, Shiye LEI, Youyou JIANG, Choi CHANG, Hichem SNOUSSI, Guangcun SHAN, Yao FU. Accelerating temporal action proposal generation via high performance computing. Front. Comput. Sci., 2022, 16(4): 164317.
Link to this article:
https://academic.hep.com.cn/fcs/CN/10.1007/s11704-021-0173-7
https://academic.hep.com.cn/fcs/CN/Y2022/V16/I4/164317
Method               AR@100 (val)   AUC (val)
Zhao et al. [18]     0.653          53.02
Dai et al. [34]      -              59.58
Ghanem et al. [35]   -              63.12
Lin et al. [8]       0.748          66.17
MTN                  0.756          67.26
Tab.1  
Method           AR@50   AR@100   AR@200   AR@500
DAPs [11]        13.56   23.83    33.96    49.29
SCNN-prop [12]   17.22   26.17    37.01    51.57
SST [9]          19.90   28.36    37.90    51.58
TURN [17]        19.63   27.96    38.34    53.52
BSN [8]          29.58   37.38    45.55    54.67
MTN              30.61   38.12    46.24    55.31
Tab.2  
Method         AR@1    AR@5    AR@10   AR@50   AR@100   AUC
Origin         0.292   0.469   0.549   0.696   0.748    66.17
SE-ConvNet     0.303   0.476   0.553   0.699   0.750    66.57
Mul-DenseNet   0.317   0.482   0.556   0.702   0.751    66.85
MTN            0.332   0.490   0.562   0.706   0.756    67.26
Tab.3  
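The tables above report recall at a fixed number of proposals. As a rough illustration of how such a number is commonly obtained, the sketch below computes temporal IoU between segments and a recall averaged over several IoU thresholds for one video. It is a simplified per-video version under stated assumptions: the function names, the strict top-N cut, and the threshold list are hypothetical choices for this example, not the evaluation code used in the paper or by the benchmarks.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) segments in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at(proposals, ground_truth, top_n, tiou):
    """Fraction of ground-truth segments matched by any of the first top_n
    proposals with temporal IoU >= tiou. Proposals are assumed to be sorted
    by descending confidence."""
    kept = proposals[:top_n]
    hits = sum(
        1 for g in ground_truth
        if any(temporal_iou(p, g) >= tiou for p in kept)
    )
    return hits / len(ground_truth)


def average_recall(proposals, ground_truth, top_n, thresholds):
    """Mean of recall_at over a list of temporal IoU thresholds."""
    total = sum(recall_at(proposals, ground_truth, top_n, t) for t in thresholds)
    return total / len(thresholds)


if __name__ == "__main__":
    # Hypothetical proposals (already ranked) and ground truth for one video.
    proposals = [(0.0, 10.0), (20.0, 30.0), (12.0, 18.0)]
    ground_truth = [(0.0, 8.0), (21.0, 30.0)]
    print(average_recall(proposals, ground_truth, top_n=2, thresholds=[0.5, 0.85]))
```

A full benchmark evaluation would aggregate over all videos and over a range of proposal counts before computing an area under the curve; those steps are omitted here.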
1 Muhammad K, Hamza R, Ahmad J, Lloret J, Wang H, Baik S. Secure surveillance framework for IoT systems using probabilistic image encryption. IEEE Transactions on Industrial Informatics, 2018, 14(8): 3679–3689
2 Sajjad M, Haq I U, Lloret J, Ding W, Muhammad K. Robust image hashing based efficient authentication for smart industrial environment. IEEE Transactions on Industrial Informatics, 2019, 15(12): 6541–6550
3 Wang T, Qiao M, Lin Z, Li C, Snoussi H, Liu Z, Choi C. Generative neural networks for anomaly detection in crowded scenes. IEEE Transactions on Information Forensics and Security, 2018, 14(5): 1390–1399
4 Muhammad K, Khan S, Palade V, Mehmood I, De Albuquerque V H. Edge intelligence-assisted smoke detection in foggy surveillance environments. IEEE Transactions on Industrial Informatics, 2019, 16(2): 1067–1075
5 Wang T, Miao Z, Chen Y, Zhou Y, Shan G, Snoussi H. AED-Net: an abnormal event detection network. Engineering, 2019, 5(5): 930–939
6 Caba Heilbron F, Escorcia V, Ghanem B, Carlos Niebles J. ActivityNet: a large-scale video benchmark for human activity understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 961–970
7 Jiang Y G, Liu J, Zamir A R, Toderici G, Laptev I, Shah M, Sukthankar R. THUMOS challenge: action recognition with a large number of classes. 2014
8 Lin T, Zhao X, Su H, Wang C, Yang M. BSN: boundary sensitive network for temporal action proposal generation. In: Proceedings of the European Conference on Computer Vision. 2018, 3–19
9 Buch S, Escorcia V, Shen C, Ghanem B, Carlos Niebles J. SST: single-stream temporal action proposals. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 2911–2920
10 Caba Heilbron F, Carlos Niebles J, Ghanem B. Fast temporal activity proposals for efficient detection of human actions in untrimmed videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1914–1923
11 Escorcia V, Heilbron F C, Niebles J C, Ghanem B. DAPs: deep action proposals for action understanding. In: Proceedings of the European Conference on Computer Vision. 2016, 768–784
12 Shou Z, Wang D, Chang S F. Temporal action localization in untrimmed videos via multi-stage CNNs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1049–1058
13 Dean J, Corrado G, Monga R, Chen K, Devin M, Mao M, Senior A, Tucker P, Yang K, Le Q V, et al. Large scale distributed deep networks. In: Proceedings of the Advances in Neural Information Processing Systems. 2012, 1223–1231
14 Karaman S, Seidenari L, Del Bimbo A. Fast saliency based pooling of Fisher encoded dense trajectories. In: Proceedings of the European Conference on Computer Vision THUMOS Workshop. 2014
15 Wang L, Qiao Y, Tang X. Action recognition and detection by combining motion and appearance features. THUMOS14 Action Recognition Challenge, 2014, 1(2): 2–
16 Wang T, Chen Y, Lin Z, Zhu A, Li Y, Snoussi H, Wang H. RecapNet: action proposal generation mimicking human cognitive process. IEEE Transactions on Cybernetics, 2020, DOI:
17 Gao J, Yang Z, Chen K, Sun C, Nevatia R. TURN TAP: temporal unit regression network for temporal action proposals. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 3628–3636
18 Zhao Y, Xiong Y, Wang L, Wu Z, Tang X, Lin D. Temporal action detection with structured segment networks. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 2914–2923
19 Jian M, Lam K M, Dong J, Shen L. Visual-patch-attention-aware saliency detection. IEEE Transactions on Cybernetics, 2014, 45(8): 1575–1586
20 Jian M, Qi Q, Dong J, Yin Y, Lam K M. Integrating QDWD with pattern distinctness and local contrast for underwater saliency detection. Journal of Visual Communication and Image Representation, 2018, 53: 31–41
21 Jian M, Qi Q, Yu H, Dong J, Cui C, Nie X, Zhang H, Yin Y, Lam K M. The extended marine underwater environment database and baseline evaluations. Applied Soft Computing, 2019, 80: 425–437
22 Wang T, Chen Y, Lv H, Teng J, Snoussi H, Tao F. Online detection of action start via soft computing for smart city. IEEE Transactions on Industrial Informatics, 2020, 17(1): 524–533
23 Wang H, Kläser A, Schmid C, Liu C L. Action recognition by dense trajectories. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2011, 3169–3176
24 Feichtenhofer C, Pinz A, Zisserman A. Convolutional two-stream network fusion for video action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 1933–1941
25 Wang L, Xiong Y, Wang Z, Qiao Y, Lin D, Tang X, Van Gool L. Temporal segment networks: towards good practices for deep action recognition. In: Proceedings of the European Conference on Computer Vision. 2016, 20–36
26 Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision. 2015, 4489–4497
27 Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Communications of the ACM, 2008, 51(1): 107–113
28 Low Y, Bickson D, Gonzalez J, Guestrin C, Kyrola A, Hellerstein J M. Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 2012, 5(8): 716–727
29 Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018, 7132–7141
30 Simonyan K, Zisserman A. Two-stream convolutional networks for action recognition in videos. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014, 568–576
31 Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado G S, Davis A, Dean J, Devin M, et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. 2016, arXiv preprint arXiv:1603.04467
32 Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. 2015, arXiv preprint arXiv:1502.03167
33 He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770–778
34 Dai X, Singh B, Zhang G, Davis L S, Chen Y Q. Temporal context network for activity localization in videos. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 5793–5802
35 Ghanem B, Niebles J C, Snoek C, Heilbron F C, Alwassel H, Khrisna R, Escorcia V, Hata K, Buch S. ActivityNet challenge 2017 summary. 2017, arXiv preprint arXiv:1710.08011