Personalized query suggestion diversification in information retrieval
Wanyu CHEN1, Fei CAI1, Honghui CHEN1, Maarten DE RIJKE2
1. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
2. Informatics Institute, University of Amsterdam, Amsterdam 1098XH, The Netherlands
Query suggestions help users refine their queries after they input an initial query. Previous work on query suggestion has mainly concentrated on similarity-based or context-based approaches, developing models that focus either on adapting to a specific user (personalization) or on diversifying query aspects in order to maximize the probability of the user being satisfied (diversification). We consider the task of generating query suggestions that are both personalized and diversified. We propose a personalized query suggestion diversification (PQSD) model, in which a user's long-term search behavior is injected into a basic greedy query suggestion diversification model that considers the user's search context in the current session. Query aspects are identified through clicked documents, based on the Open Directory Project (ODP), with a latent Dirichlet allocation (LDA) topic model. We quantify the improvement of the proposed PQSD model over a state-of-the-art baseline using the publicly available America Online (AOL) query log and show that it outperforms the baseline in terms of metrics used in query suggestion ranking and diversification. The experimental results show that PQSD achieves its best performance when only queries with clicked documents, rather than all queries, are taken as search context, especially when more query suggestions are returned in the list.
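To make the idea of greedy, personalized diversification more concrete, the following Python sketch shows one way such a selection loop could look. It is not the PQSD model itself: the scoring rule (an explicit, aspect-based mixture of relevance and novelty), the function name `greedy_diversify`, the trade-off parameter `lam`, and the toy data are all assumptions made for illustration. In the paper's setting, the aspect distributions would come from an LDA model over clicked documents and the user preferences from long-term search behavior; here they are simply hard-coded.

```python
from typing import Dict, List


def greedy_diversify(
    candidates: List[str],
    relevance: Dict[str, float],
    aspect_probs: Dict[str, Dict[str, float]],
    user_pref: Dict[str, float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily pick k suggestions, trading off relevance and aspect novelty.

    Hypothetical scoring rule (an assumption, not the paper's formula):
        score(q) = (1 - lam) * relevance[q]
                 + lam * sum_a user_pref[a] * aspect_probs[q][a] * uncovered[a]
    where uncovered[a] is the probability that aspect a is not yet covered
    by the suggestions selected so far.
    """
    selected: List[str] = []
    remaining = list(candidates)
    # Initially no aspect is covered by any selected suggestion.
    uncovered = {aspect: 1.0 for aspect in user_pref}

    while remaining and len(selected) < k:

        def score(q: str) -> float:
            novelty = sum(
                user_pref[a] * aspect_probs[q].get(a, 0.0) * uncovered[a]
                for a in user_pref
            )
            return (1.0 - lam) * relevance[q] + lam * novelty

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        # Aspects covered by the chosen suggestion contribute less from now on.
        for a in user_pref:
            uncovered[a] *= 1.0 - aspect_probs[best].get(a, 0.0)

    return selected


# Toy data (entirely made up): three candidate suggestions over two aspects.
candidates = ["jaguar car", "jaguar price", "jaguar animal"]
relevance = {"jaguar car": 0.9, "jaguar price": 0.8, "jaguar animal": 0.5}
aspect_probs = {
    "jaguar car": {"autos": 1.0},
    "jaguar price": {"autos": 1.0},
    "jaguar animal": {"animals": 1.0},
}
user_pref = {"autos": 0.5, "animals": 0.5}  # stand-in for long-term behavior

diversified = greedy_diversify(candidates, relevance, aspect_probs, user_pref, k=2)
relevance_only = greedy_diversify(
    candidates, relevance, aspect_probs, user_pref, k=2, lam=0.0
)
print(diversified)
print(relevance_only)
```

With `lam=0.0` the loop reduces to ranking by relevance alone, so both top suggestions fall under the same aspect. With `lam=0.5` the second pick shifts to the aspect not yet covered, which is the behavior diversification aims for. Changing `user_pref` shifts that balance toward the aspects a particular user has favored in the past.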