Learning deep representations for semantic image parsing: a comprehensive overview
Lili HUANG, Jiefeng PENG, Ruimao ZHANG, Guanbin LI, Liang LIN
School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China

Abstract Semantic image parsing, the process of decomposing an image into semantic regions and constructing a structured representation of the input, has recently attracted widespread interest in computer vision. The application of deep representation learning has driven the field into a new stage of development. In this paper, we summarize research progress on semantic image parsing in three areas: category-level semantic segmentation, instance-level semantic segmentation, and beyond segmentation. For each task, we first review the general frameworks and introduce the relevant variants, and then discuss the advantages and limitations of each method. We also present a comprehensive comparison of benchmark datasets and evaluation metrics. Finally, we explore future trends and challenges in semantic image parsing.

Keywords: semantic image segmentation, deep learning, convolutional neural networks, image parsing

Corresponding Author: Liang LIN
Online First Date: 04 September 2018
Issue Date: 21 September 2018