Generating timeline summaries with social media attention
Wayne Xin ZHAO1,4,*, Ji-Rong WEN1,2, Xiaoming LI3
1. School of Information, Renmin University of China, Beijing 100872, China
2. Beijing Key Laboratory of Big Data Management and Analysis Methods, Renmin University of China, Beijing 100872, China
3. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
4. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing 100144, China
Timeline generation is an important research task that helps users quickly understand the overall evolution of a given topic. Previous methods simply split the time span into fixed, equal intervals without studying how the evolutionary patterns of the underlying topic should shape the timeline. In addition, few of these methods take users' collective interests into consideration when generating timelines.
We consider using social media attention to address these two problems, for two reasons: 1) social media is an important pool of real users' collective interests; 2) the information cascades generated in it might be good indicators of the boundaries of topic phases. Using Twitter as a basis, we propose to incorporate topic phases and users' collective interests, both learnt from social media, into a unified timeline generation algorithm. We construct one informativeness-oriented and three interestingness-oriented evaluation sets over five topics. We demonstrate that the proposed approach is very effective at generating timelines that are both informative and interesting. In addition, our idea naturally leads to a novel presentation of timelines, i.e., phase-based timelines, which can potentially improve user experience.
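The contrast between fixed, equal intervals and attention-driven phase boundaries can be illustrated with a toy sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a hypothetical list of daily tweet counts for one topic and a hypothetical rule that starts a new phase whenever the volume at least doubles from one day to the next. The function names, the `ratio` parameter, and the sample data are all assumptions made for illustration.

```python
from typing import List


def equal_intervals(num_days: int, num_phases: int) -> List[int]:
    """Baseline: start indices of fixed, equal-width intervals."""
    return [i * num_days // num_phases for i in range(num_phases)]


def burst_boundaries(daily_counts: List[int], ratio: float = 2.0) -> List[int]:
    """Hypothetical rule: a new phase starts on any day whose volume is
    at least `ratio` times the previous day's volume."""
    starts = [0]  # the first phase always starts on day 0
    for day in range(1, len(daily_counts)):
        previous = max(daily_counts[day - 1], 1)  # avoid comparing against zero
        if daily_counts[day] >= ratio * previous:
            starts.append(day)
    return starts


# Made-up daily tweet counts for a single topic (illustrative only).
counts = [5, 6, 4, 40, 35, 20, 8, 7, 60, 30, 12, 9]

print(equal_intervals(len(counts), 3))  # phase starts under equal splitting
print(burst_boundaries(counts))         # phase starts under the burst rule
```

With this sample data, the equal split places a boundary one day after the first surge in volume, whereas the burst rule places it on the day the surge begins. That is the kind of mismatch that motivates phase boundaries driven by social media attention.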