Joint user profiling with hierarchical attention networks
Xiaojian LIU 1,2, Yi ZHU 1,2,3, Xindong WU 1,2,4
1. Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230009, China
2. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China
3. School of Information Engineering, Yangzhou University, Yangzhou 225009, China
4. Mininglamp Academy of Sciences, Mininglamp, Beijing 100193, China
Abstract User profiling, which infers user personality traits such as age and gender, plays an increasingly important role in many real-world applications. Most existing methods for user profiling either use only one type of data or fail to handle noisy information in the data. Moreover, they usually approach the problem from only one perspective. In this paper, we propose a joint user profiling model with hierarchical attention networks (JUHA) to learn informative user representations for user profiling. JUHA performs user profiling based on both inner-user and inter-user features. We extract inner-user features from user behaviors (e.g., purchased items and posted blogs) and inter-user features from a user-user graph, in which similar users can be connected to each other. As a first round of data preparation, JUHA learns basic sentence and bag representations from multiple separate sources of data (user behaviors). In this module, convolutional neural networks (CNNs) capture word and sentence features relevant to age and gender, while the self-attention mechanism down-weights noisy data. We then build another bag that contains a user-user graph, and learn inter-user features from this bag by propagating information between linked users in the graph. To obtain more robust data, inter-user features and the other inner-user bag representations are joined into each sentence in the current bag to learn the final bag representation. Subsequently, all of the bag representations are integrated by the self-attention mechanism to learn a comprehensive user representation. Our experimental results demonstrate that our approach outperforms several state-of-the-art methods and improves prediction performance.
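The two attention levels described in the abstract (sentences pooled into a bag representation, then bags pooled into a user representation) can be illustrated with a small sketch. This is not the authors' implementation: it replaces the CNN encoders and the self-attention layers with a plain dot-product attention pooling against a fixed query vector, it omits the user-user graph entirely, and every name and value in it (`attention_pool`, `bags`, `sentence_query`, `bag_query`, the toy vectors) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Pool vectors into one vector, weighting each by softmax(dot(vector, query)).

    The query stands in for a learned parameter; here it is fixed by hand.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Hypothetical toy data: each bag holds sentence vectors from one behavior source.
bags = {
    "purchases": [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]],  # last sentence acts as noise
    "blogs": [[0.2, 0.8], [0.1, 0.9]],
}
sentence_query = [1.0, 1.0]  # assumed, not learned
bag_query = [0.5, 0.5]       # assumed, not learned

# Level 1: sentences -> bag representation.
bag_vectors = []
for sentences in bags.values():
    bag_vector, _ = attention_pool(sentences, sentence_query)
    bag_vectors.append(bag_vector)

# Level 2: bag representations -> user representation.
user_vector, bag_weights = attention_pool(bag_vectors, bag_query)
```

In this toy setup the third "purchases" sentence scores lowest against the query, so it receives the smallest weight, which is the sense in which attention can reduce the influence of noisy items.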
Keywords
user profiling
hierarchical attention
joint learning
inner-user feature
inter-user feature
Corresponding Author(s):
Xindong WU
About author: Tongcan Cui and Yizhe Hou contributed equally to this work.
Just Accepted Date: 15 April 2022
Issue Date: 09 October 2022