Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci. 2018, Vol. 12, Issue 6: 1241-1254. https://doi.org/10.1007/s11704-016-6061-x
RESEARCH ARTICLE
M-generalization for multipurpose transactional data publication
Xianxian LI1,2, Peipei SUI1, Yan BAI3, Li-E WANG2
1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
2. Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
3. Institute of Technology, University of Washington Tacoma, Tacoma WA 98402-3100, USA
Abstract

Transactional data collection and sharing currently face the challenge of preventing information leakage and privacy breaches while maintaining high data utility. Data anonymization methods such as perturbation, generalization, and suppression have been proposed for privacy protection. However, many of these methods incur excessive information loss and cannot satisfy multipurpose utility requirements. In this paper, we propose a multidimensional generalization method that provides multipurpose optimization when anonymizing transactional data, offering better data utility for different applications. Our method uses bipartite graphs together with attribute generalization, item grouping, and outlier perturbation. Experiments on real-life datasets show that our solution considerably improves data utility compared with existing algorithms.

Keywords: anonymization, generalization, privacy protection, bipartite graph
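
As a rough illustration of two ideas named in the abstract, the sketch below represents transactions as edges of a bipartite graph (transaction nodes on one side, item nodes on the other) and generalizes items through a one-level hierarchy. It is not the authors' algorithm: the toy data, the taxonomy, and the function names `to_bipartite_edges`, `generalize`, and `item_support` are all hypothetical, chosen only to show how generalization merges items and raises their support.

```python
from collections import defaultdict

# Hypothetical toy data: transaction id -> set of purchased items.
transactions = {
    "t1": {"beer", "wine"},
    "t2": {"wine", "milk"},
    "t3": {"milk", "cheese"},
}

# Hypothetical one-level generalization hierarchy: item -> parent category.
taxonomy = {
    "beer": "alcohol",
    "wine": "alcohol",
    "milk": "dairy",
    "cheese": "dairy",
}


def to_bipartite_edges(transactions):
    """List the (transaction, item) edges of the bipartite graph."""
    return sorted(
        (tid, item) for tid, items in transactions.items() for item in items
    )


def generalize(transactions, taxonomy):
    """Replace each item by its parent category; unknown items stay as-is."""
    return {
        tid: {taxonomy.get(item, item) for item in items}
        for tid, items in transactions.items()
    }


def item_support(transactions):
    """Count how many transactions contain each item."""
    support = defaultdict(int)
    for items in transactions.values():
        for item in items:
            support[item] += 1
    return dict(support)


edges = to_bipartite_edges(transactions)
generalized = generalize(transactions, taxonomy)
support_before = item_support(transactions)
support_after = item_support(generalized)
```

In this toy case, "beer" and "cheese" each appear in a single transaction before generalization, while every generalized category appears in two. That is the usual trade-off behind generalization: rare, identifying items are hidden at the cost of coarser data.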
Corresponding Author(s): Li-E WANG   
Just Accepted Date: 18 July 2016   Online First Date: 27 November 2017    Issue Date: 04 December 2018
 Cite this article:   
Xianxian LI,Peipei SUI,Yan BAI, et al. M-generalization for multipurpose transactional data publication[J]. Front. Comput. Sci., 2018, 12(6): 1241-1254.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-016-6061-x
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I6/1241