PSG: a two-layer graph model for document summarization
Heng CHEN, Hai JIN, Feng ZHAO
Service Computing Technology and System Lab & Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract Graph models have been widely applied to document summarization, with sentences as the graph nodes and the similarity between sentences as the edges. In this paper, we present a novel graph model for document summarization that exploits not only the relevance between sentences but also the relevance between the phrases those sentences contain. Specifically, we construct a phrase-sentence two-layer graph model (PSG) to summarize one or more documents. We apply this model to both generic and query-focused document summarization. The experimental results show that our model greatly outperforms existing work.
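As a rough illustration of the conventional single-layer sentence graph that the abstract refers to (sentences as nodes, pairwise similarity as edge weights, scores from a Markov random walk), the following minimal sketch may help. It is not the PSG model and does not reproduce the authors' method: the bag-of-words cosine similarity, the damping factor of 0.85, the fixed iteration count, and the function names `tokenize`, `cosine`, and `rank_sentences` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=100):
    """Score sentences by a damped random walk over the similarity graph.

    Assumed baseline only: a single-layer sentence graph, not the
    two-layer phrase-sentence graph proposed in the paper.
    """
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    # Edge weights: pairwise similarity between sentences, no self-loops.
    weights = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    # Row-normalize into transition probabilities; a sentence with no
    # similar neighbours jumps uniformly to any sentence.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total for w in row] if total else [1.0 / n] * n)
    # Power iteration towards the stationary distribution of the walk.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [(1 - damping) / n
                  + damping * sum(scores[j] * transition[j][i] for j in range(n))
                  for i in range(n)]
    return scores


if __name__ == "__main__":
    docs = [
        "Graph models rank sentences for summarization.",
        "Sentences are nodes in the graph.",
        "Similarity between sentences gives the edges of the graph.",
        "Bananas taste sweet.",
    ]
    for score, sentence in sorted(zip(rank_sentences(docs), docs), reverse=True):
        print(f"{score:.3f}  {sentence}")
```

In a sketch like this, a summary would be formed by taking the highest-scoring sentences. A two-layer model would additionally need phrase nodes and phrase-sentence links, which are not shown here because the abstract does not specify them.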
Keywords relationship graph, Markov random walk, document summarization
Corresponding Author(s): JIN Hai
Issue Date: 01 February 2014