Mean estimation over numeric data with personalized local differential privacy
Qiao XUE1, Youwen ZHU1,2,3, Jian WANG1
1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China 2. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China 3. School of Cyber Security, Gansu University of Political Science and Law, Lanzhou 730070, China
The rapid development of the Internet and mobile devices has given rise to a crowdsensing business model, in which individuals (users) willingly contribute their data to help an institution (the data collector) analyze and release useful information. However, revealing personal data poses serious privacy threats to users, which impedes the wide application of the crowdsensing model. To address this problem, local differential privacy (LDP) was proposed. Subsequently, to accommodate users' varied privacy preferences, researchers proposed a new model, personalized local differential privacy (PLDP), which allows users to specify their own privacy parameters. In this paper, we focus on the basic task of calculating the mean value of a single numeric attribute under PLDP. Building on previous schemes for mean estimation under LDP, we employ the PLDP model to design novel schemes (LAP, DCP, PWP) that provide personalized privacy for each user. We then theoretically analyze the worst-case variance of the three proposed schemes and conduct experiments on synthetic and real datasets to evaluate their performance. The theoretical and experimental results show the optimality of PWP in the low privacy regime and a slight advantage of DCP in the high privacy regime.
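The abstract does not spell out how LAP, DCP, or PWP work, so the following is only a generic illustration of mean estimation when each user picks a personal privacy parameter. It is a minimal sketch under stated assumptions, not the paper's schemes: values are assumed to lie in [-1, 1], each user adds Laplace noise with scale 2/epsilon_i locally, and the collector either averages the reports or weights them by epsilon_i squared (inverse noise variance). The function names `perturb` and `estimate_mean` are hypothetical.

```python
import random


def perturb(value, epsilon, rng=random):
    """User side: add Laplace noise calibrated to this user's own epsilon.

    Assumption: value lies in [-1, 1], so the sensitivity is 2 and the
    Laplace scale is 2 / epsilon.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("value must lie in [-1, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 2.0 / epsilon
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


def estimate_mean(reports, epsilons=None):
    """Collector side: estimate the mean of the original values.

    With epsilons=None this is the plain average of the noisy reports.
    Otherwise each report is weighted by epsilon**2, which is proportional
    to the inverse of its noise variance (8 / epsilon**2). Both variants
    are unbiased when the values are independent of the chosen epsilons
    (an assumption of this sketch).
    """
    if not reports:
        raise ValueError("no reports")
    if epsilons is None:
        return sum(reports) / len(reports)
    weights = [e * e for e in epsilons]
    return sum(w * r for w, r in zip(weights, reports)) / sum(weights)
```

A collector would call `perturb` once per user with that user's epsilon and pass the resulting list to `estimate_mean`. The weighted variant gives more influence to users who chose a larger epsilon, whose reports carry less noise.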