Crowd counting via learning perspective for multi-scale multi-view Web images
Chong SHANG1, Haizhou AI1, Yi YANG2
1. Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2. Huawei Technologies, Beijing 100084, China

Abstract Estimating the number of people in Web images remains a challenging problem owing to perspective variation, different views, and diverse backgrounds. Existing deep learning models still have difficulty with scenes in which a person appears either extremely large or extremely small. In this paper, we propose a novel perspective-aware architecture to estimate the number of people in a crowd in Web images. Specifically, we use a two-stage framework: we first learn a policy network that infers the perspective of the target scene and outputs a scale label for the subsequent perspective normalization. Next, given the aligned inputs, we further adjust the scale-specific counting network to regress the final count. Experiments on challenging datasets demonstrate that our approach can handle large perspective variation and achieves state-of-the-art results.
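
The control flow of the two-stage framework described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the scale labels, the upsampling factors in `SCALE_FACTORS`, the nearest-neighbour resizing, and the stand-in `toy_policy` and `density_sum` functions are all assumptions made for the example. In the paper, both the policy and the counters are learned networks.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # toy grayscale image as nested lists

# Hypothetical upsampling factor per scale label (not taken from the paper).
SCALE_FACTORS: Dict[str, int] = {"small": 2, "large": 1}


def normalize_perspective(image: Image, label: str) -> Image:
    """Stage 1b: rescale the input by the factor tied to its scale label.

    Nearest-neighbour upsampling is an assumption chosen for simplicity.
    """
    factor = SCALE_FACTORS[label]
    out: Image = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def count_people(
    image: Image,
    policy: Callable[[Image], str],
    counters: Dict[str, Callable[[Image], float]],
) -> float:
    """Run the two stages: infer a scale label, align, then count."""
    label = policy(image)                           # stage 1a: perspective inference
    aligned = normalize_perspective(image, label)   # stage 1b: normalization
    return counters[label](aligned)                 # stage 2: scale-specific counter


def toy_policy(image: Image) -> str:
    """Stand-in for the learned policy network (hypothetical heuristic)."""
    return "small" if len(image) <= 2 else "large"


def density_sum(factor: int) -> Callable[[Image], float]:
    """Stand-in counter: sums pixel values and undoes the upsampling gain."""
    def counter(image: Image) -> float:
        return sum(map(sum, image)) / (factor * factor)
    return counter


COUNTERS = {label: density_sum(f) for label, f in SCALE_FACTORS.items()}
```

Calling `count_people(image, toy_policy, COUNTERS)` routes each image through the counter that matches its inferred scale label, which is the dispatch pattern the abstract describes.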

Keywords: crowd counting, Web images, perspective inference

Corresponding Author(s): Haizhou AI
Just Accepted Date: 13 June 2017
Online First Date: 06 July 2018
Issue Date: 24 April 2019