Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2023, Vol. 17 Issue (3) : 173608    https://doi.org/10.1007/s11704-022-1437-6
RESEARCH ARTICLE
Joint user profiling with hierarchical attention networks
Xiaojian LIU1,2, Yi ZHU1,2,3, Xindong WU1,2,4
1. Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230009, China
2. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China
3. School of Information Engineering, Yangzhou University, Yangzhou 225009, China
4. Mininglamp Academy of Sciences, Mininglamp, Beijing 100193, China
Abstract

User profiling, the inference of user traits such as age and gender, plays an increasingly important role in many real-world applications. Most existing methods for user profiling either use only one type of data or do not handle noise in the data. Moreover, they usually approach the problem from a single perspective. In this paper, we propose a joint user profiling model with hierarchical attention networks (JUHA) to learn informative user representations for user profiling. JUHA profiles users based on both inner-user and inter-user features. We extract inner-user features from user behaviors (e.g., purchased items and posted blogs) and inter-user features from a user-user graph, in which similar users may be connected to each other. As a first round of data preparation, JUHA learns basic sentence and bag representations from multiple separate sources of data (user behaviors). In this module, convolutional neural networks (CNNs) capture word and sentence features related to age and gender, while a self-attention mechanism down-weights noisy data. Following this, we build another bag that contains a user-user graph. Inter-user features are learned from this bag by propagating information between linked users in the graph. To make the data more robust, the inter-user features and the other inner-user bag representations are joined into each sentence in the current bag to learn the final bag representation. Subsequently, all of the bag representations are integrated by the self-attention mechanism to learn a comprehensive user representation. Our experimental results demonstrate that our approach outperforms several state-of-the-art methods and improves prediction performance.
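The two building blocks named in the abstract, attention-based pooling and propagation over a user-user graph, can be illustrated with a minimal sketch. This is not the JUHA architecture, whose layers are not specified on this page: the functions `attention_pool` and `propagate`, the fixed `query` vector (which a real model would learn), and the toy data are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Collapse a list of equal-length vectors into one vector.

    Each vector is scored by its dot product with `query`; softmax turns the
    scores into weights, so low-scoring (noisy) items contribute little to
    the weighted sum. Hypothetical helper, not the paper's exact layer.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

def propagate(features, neighbours):
    """One round of mean aggregation over each user and its linked users."""
    out = []
    for user, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbours[user]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

# Toy bag (hypothetical): three "sentence" vectors; the last plays the noisy item.
sentences = [[1.0, 0.0], [0.9, 0.1], [-5.0, 5.0]]
query = [1.0, -1.0]
bag_vector, sentence_weights = attention_pool(sentences, query)

# Toy user-user graph: users 0 and 1 are linked, user 2 is isolated.
user_features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
links = {0: [1], 1: [0], 2: []}
inter_user = propagate(user_features, links)
```

In this toy run the third sentence receives a near-zero weight, and the two linked users end up with the same averaged feature vector while the isolated user is unchanged. Applying `attention_pool` once over sentences and again over bag vectors gives the hierarchical flavour the abstract describes.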

Keywords user profiling      hierarchical attention      joint learning      inner-user feature      inter-user feature     
Corresponding Author(s): Xindong WU   
About author:

Tongcan Cui and Yizhe Hou contributed equally to this work.

Just Accepted Date: 15 April 2022   Issue Date: 09 October 2022
 Cite this article:   
Xiaojian LIU,Yi ZHU,Xindong WU. Joint user profiling with hierarchical attention networks[J]. Front. Comput. Sci., 2023, 17(3): 173608.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-1437-6
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I3/173608
Fig.1  An example of a user's ordered and clicked items
Fig.2  Overall framework of JUHA
Fig.3  Word module
Fig.4  Sentence module
Gender            Age
Male    Female    <26     26–35    36–55    >55
25869   14971     7071    20278    12705    786
Tab.1  Details of the JD e-commerce dataset
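The counts in Table 1 show how unevenly the classes are distributed, which matters when reading the accuracy figures reported later. A short sketch, using only the numbers in the table (the helper names `class_shares` and `majority_share` are made up for illustration):

```python
# Counts copied from Table 1 (JD e-commerce dataset).
gender_counts = {"Male": 25869, "Female": 14971}
age_counts = {"<26": 7071, "26-35": 20278, "36-55": 12705, ">55": 786}

def class_shares(counts):
    """Fraction of users in each class."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def majority_share(counts):
    """Share of the largest class, i.e. the accuracy of always guessing it."""
    return max(class_shares(counts).values())

gender_majority = round(majority_share(gender_counts), 3)
age_majority = round(majority_share(age_counts), 3)
print(gender_majority)  # 0.633
print(age_majority)     # 0.497
```

Both attribute groups sum to the same 40840 users. About 63% of users are male and about half fall in the 26–35 age group, while the >55 group holds under 2% of users.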
Gender            Age
Male    Female    <26     26–35    36–55    >55
3412    1055      2552    1651     162      102
Tab.2  Details of the Sina Weibo dataset
Methods    Age                    Gender
           Accuracy   Macro-F1    Accuracy   Macro-F1
GCN        0.491      0.183       0.644      0.510
GAT        0.490      0.178       0.663      0.579
HGCN       0.491      0.174       0.646      0.513
HGAT       0.492      0.187       0.691      0.619
HURA_C     0.523      0.348       0.802      0.787
HURA_O     0.515      0.320       0.747      0.732
COOP_C     0.546      0.355       0.790      0.780
COOP_O     0.502      0.316       0.737      0.720
JUHA       0.580      0.393       0.826      0.809
Tab.3  Experimental results for user profiling on the JD e-commerce dataset. HURA_C and COOP_C are implemented with the click bag. HURA_O and COOP_O are implemented with the order bag
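Accuracy and Macro-F1 can diverge sharply on imbalanced classes, because Macro-F1 averages the per-class F1 scores with equal weight. The sketch below is a reference point for reading the table, not a result from the paper. It assumes the usual definition of Macro-F1 (unweighted mean of per-class F1) and, since the page does not describe the evaluation split, that the split has the same age distribution as the full JD dataset. It scores a trivial predictor that always outputs the largest age group.

```python
def accuracy(y_true, y_pred):
    """Fraction of exact matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (0.0 for a class never hit)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Age counts of the JD dataset; using them as the evaluation distribution
# is an assumption made only for this illustration.
age_counts = {"<26": 7071, "26-35": 20278, "36-55": 12705, ">55": 786}
y_true = [label for label, n in age_counts.items() for _ in range(n)]
y_pred = ["26-35"] * len(y_true)  # always predict the largest class

baseline_acc = round(accuracy(y_true, y_pred), 3)
baseline_f1 = round(macro_f1(y_true, y_pred), 3)
print(baseline_acc)  # 0.497
print(baseline_f1)   # 0.166
```

Under these assumptions, a constant predictor already reaches about 0.497 accuracy but only about 0.166 Macro-F1, which is why the Macro-F1 columns are the more informative ones for the age task.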
Methods    Age                    Gender
           Accuracy   Macro-F1    Accuracy   Macro-F1
GCN        0.558      0.266       0.743      0.561
GAT        0.560      0.260       0.761      0.591
HGCN       0.563      0.269       0.766      0.612
HGAT       0.566      0.275       0.771      0.630
HURA       0.631      0.293       0.827      0.728
COOP       0.616      0.300       0.839      0.758
JUHA       0.639      0.322       0.837      0.743
Tab.4  Experimental results for user profiling on the Sina Weibo dataset
Fig.5  Impact of joint learning on the JD dataset. (a) Joint learning for age prediction on JD dataset; (b) joint learning for gender prediction on JD dataset
Fig.6  Impact of hierarchical attention on the JD dataset. (a) Hierarchical attention for age prediction on JD dataset; (b) hierarchical attention for gender prediction on JD dataset
Fig.7  Impact of hierarchical attention on the Sina Weibo dataset. (a) Hierarchical attention for age prediction on Sina Weibo dataset; (b) hierarchical attention for gender prediction on Sina Weibo dataset
  
  
  