Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2018, Vol. 12 Issue (3) : 571-581    https://doi.org/10.1007/s11704-016-6078-1
RESEARCH ARTICLE
Exploit latent Dirichlet allocation for collaborative filtering
Zhoujun LI1, Haijun ZHANG1,2, Senzhang WANG3,4, Feiran HUANG1, Zhenping LI2, Jianshe ZHOU5
1. State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
2. School of Information, Beijing Wuzi University, Beijing 101149, China
3. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
4. Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 211106, China
5. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048, China
Abstract

Previous work on the one-class collaborative filtering (OCCF) problem can be roughly categorized into pointwise methods, pairwise methods, and content-based methods. These approaches share a fundamental assumption: all missing values in the user-item rating matrix are treated as negative. However, this assumption may not hold, because the missing values may contain both negative and positive examples. For example, a user who has not given positive feedback on an item does not necessarily dislike it; they may simply be unfamiliar with it. Meanwhile, content-based methods, e.g., collaborative topic regression (CTR), usually require textual content information about the items, so their applicability is largely limited when that text is not available. In this paper, we propose applying the latent Dirichlet allocation (LDA) model to OCCF to address these problems. The basic idea is that items are regarded as words, users are regarded as documents, and the user-item feedback matrix constitutes the corpus. Our model drops the strong assumption that missing values are all negative and uses only the observed data to predict a user's interest. Additionally, the proposed model does not need content information about the items. Experimental results indicate that the proposed method outperforms previous methods on various ranking-oriented evaluation metrics. We further combine this method with a matrix factorization-based method to tackle the multi-class collaborative filtering (MCCF) problem, which also achieves better performance in predicting user ratings.
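The users-as-documents, items-as-words idea can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes collapsed Gibbs sampling for inference and scores an unseen item as the sum over topics of P(topic | user) × P(item | topic). Neither choice is stated in the abstract. The function names (`fit_lda`, `score_items`, `recommend`), the hyperparameter values, and the toy data are all hypothetical.

```python
import random


def fit_lda(user_items, n_items, n_topics=2, alpha=0.1, beta=0.01,
            n_iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed inference method).

    user_items[u] lists the item ids with observed positive feedback from
    user u. Each user plays the role of a document and each item the role
    of a word; missing entries are never touched.
    """
    rng = random.Random(seed)
    n_users = len(user_items)
    user_topic = [[0] * n_topics for _ in range(n_users)]  # counts per (user, topic)
    topic_item = [[0] * n_items for _ in range(n_topics)]  # counts per (topic, item)
    topic_total = [0] * n_topics                           # counts per topic
    assign = []                                            # topic of each observation

    # Random initial topic assignment for every observed (user, item) pair.
    for u, items in enumerate(user_items):
        zs = []
        for i in items:
            k = rng.randrange(n_topics)
            zs.append(k)
            user_topic[u][k] += 1
            topic_item[k][i] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for u, items in enumerate(user_items):
            for pos, i in enumerate(items):
                # Remove the current assignment from the counts.
                k = assign[u][pos]
                user_topic[u][k] -= 1
                topic_item[k][i] -= 1
                topic_total[k] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (user_topic[u][t] + alpha)
                    * (topic_item[t][i] + beta)
                    / (topic_total[t] + n_items * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[u][pos] = k
                user_topic[u][k] += 1
                topic_item[k][i] += 1
                topic_total[k] += 1

    # Smoothed point estimates: theta[u][t] = P(topic t | user u),
    # phi[t][i] = P(item i | topic t).
    theta = [[(user_topic[u][t] + alpha) / (len(user_items[u]) + n_topics * alpha)
              for t in range(n_topics)] for u in range(n_users)]
    phi = [[(topic_item[t][i] + beta) / (topic_total[t] + n_items * beta)
            for i in range(n_items)] for t in range(n_topics)]
    return theta, phi


def score_items(theta_u, phi):
    """Return P(item | user) = sum_t P(t | user) * P(item | t) for every item."""
    return [sum(theta_u[t] * phi[t][i] for t in range(len(phi)))
            for i in range(len(phi[0]))]


def recommend(u, user_items, theta, phi, top_n=3):
    """Rank the items user u has not interacted with by predicted score."""
    seen = set(user_items[u])
    scores = score_items(theta[u], phi)
    unseen = [i for i in range(len(scores)) if i not in seen]
    return sorted(unseen, key=lambda i: -scores[i])[:top_n]
```

For example, with `user_items = [[0, 1, 2], [0, 1], [3, 4, 5], [4, 5]]` and `n_items=6`, calling `fit_lda` followed by `recommend(1, user_items, theta, phi)` returns a ranking of the items user 1 has not yet interacted with. Only the observed feedback enters the counts, which mirrors the abstract's point that missing values are not treated as negatives.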

Keywords: latent Dirichlet allocation; one-class collaborative filtering; multi-class collaborative filtering
Corresponding Author(s): Haijun ZHANG   
Just Accepted Date: 05 September 2016   Online First Date: 12 December 2017    Issue Date: 02 May 2018
 Cite this article:   
Zhoujun LI,Haijun ZHANG,Senzhang WANG, et al. Exploit latent Dirichlet allocation for collaborative filtering[J]. Front. Comput. Sci., 2018, 12(3): 571-581.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-016-6078-1
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I3/571