Video structural description technology for the new generation video surveillance systems
Chuanping HU, Zheng XU, Yunhuai LIU, Lin MEI
The Third Research Institute of the Ministry of Public Security, Shanghai 200031, China
Abstract The growing demand for video-based applications highlights the importance of parsing and organizing video content. However, accurate understanding and management of video content at the semantic level remain insufficient. The semantic gap between low-level features and high-level semantics cannot be bridged by manual or semi-automatic methods. In this paper, we propose a semantic-based model named video structural description (VSD) for representing and organizing video content. VSD aims to parse video content into text information using spatiotemporal segmentation, feature selection, object recognition, and semantic web technology. The proposed model uses predefined ontologies, comprising concepts and their semantic relations, to represent video content. The defined ontologies can be used to retrieve and organize videos unambiguously. In addition to the defined ontologies, the semantic relations between videos are mined, and the video resources are linked and organized by these relations.
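To make the idea concrete, the following is a minimal sketch of how detected objects might be mapped onto a predefined ontology and how videos might then be linked by shared concepts. It is not the authors' implementation: the toy ontology, the `describe` and `link_videos` helpers, the relation names, and the sample detections are all hypothetical, and the recognition stages (segmentation, feature selection, object recognition) are assumed to have already produced labelled detections.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy ontology: concept -> parent concept (an "is-a" relation).
ONTOLOGY = {
    "car": "vehicle",
    "truck": "vehicle",
    "person": "agent",
}


def describe(video_id, detections):
    """Turn (subject, relation, object) detections into text triples.

    Only detections whose subject and object are both known ontology
    concepts are kept, so the description stays unambiguous.
    """
    triples = []
    for subj, rel, obj in detections:
        if subj in ONTOLOGY and obj in ONTOLOGY:
            triples.append((video_id, subj, rel, obj))
    return triples


def link_videos(descriptions):
    """Link two videos when their descriptions share at least one concept."""
    concepts = defaultdict(set)
    for video_id, subj, _rel, obj in descriptions:
        concepts[video_id].update((subj, obj))
    links = {}
    for a, b in combinations(sorted(concepts), 2):
        shared = concepts[a] & concepts[b]
        if shared:
            links[(a, b)] = shared
    return links


# Assumed sample detections; "dog" is outside the ontology, so v3 is dropped.
descriptions = (
    describe("v1", [("person", "enters", "car")])
    + describe("v2", [("person", "near", "truck")])
    + describe("v3", [("truck", "passes", "dog")])
)
links = link_videos(descriptions)
print(links)
```

In this sketch, linking on shared concepts stands in for the mined semantic relations between videos; a real system would presumably weight or type those relations rather than use plain set overlap.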
Keywords: video structural description; video content extraction; video resources management; domain ontology
Corresponding Author: Zheng XU
Just Accepted Date: 20 May 2015
Issue Date: 10 November 2015