1. College of Computer Science, Zhejiang University, Hangzhou 310027, China
2. Fujian Key Laboratory of Sensing and Computing for Smart Cities, School of Information Science and Engineering, Xiamen University, Xiamen 361005, China
Crime risk prediction is helpful for urban safety and citizens’ quality of life. However, existing crime studies focus on coarse-grained prediction and usually fail to capture the dynamics of urban crime. The key challenge is data sparsity, because 1) not all crimes are recorded, and 2) crimes usually occur with low frequency. In this paper, we propose an effective framework to predict fine-grained, dynamic crime risks on each road using heterogeneous urban data. First, we propose a cross-aggregation soft-impute (CASI) method to handle possibly unreported crimes. Then, we use a novel crime risk measurement to capture crime dynamics from the perspective of influence propagation, taking into account both time-varying and location-varying risk propagation. Based on the dynamically calculated crime risks, we design contextual features (i.e., POI distributions, taxi mobility, demographic features) from various urban data sources and propose a zero-inflated negative binomial regression (ZINBR) model to predict future crime risks on roads. Experiments on real-world data from New York City show that our framework accurately predicts road crime risks and outperforms the baseline methods.
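The abstract names a zero-inflated negative binomial regression (ZINBR) model but does not give its form here. As a minimal sketch of the standard zero-inflated negative binomial distribution that such a model is typically built on, the following assumes the mean/dispersion parameterization. The names `nb_pmf`, `zinb_pmf`, `zinb_mean`, `mu`, `r`, and `pi` are hypothetical and do not come from the paper, and the regression step that would link `mu` and `pi` to the contextual features is omitted.

```python
import math


def nb_pmf(k, mu, r):
    """Negative binomial pmf with mean mu > 0 and dispersion r > 0 (assumed parameterization)."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, r, pi):
    """Zero-inflated NB pmf: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial above."""
    base = (1.0 - pi) * nb_pmf(k, mu, r)
    return pi + base if k == 0 else base


def zinb_mean(mu, pi):
    """Expected count under the zero-inflated model."""
    return (1.0 - pi) * mu
```

The extra mass `pi` at zero is what lets the distribution absorb the many road segments with no recorded crime, while the dispersion `r` accommodates variance larger than the mean in the remaining counts.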