Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2016, Vol. 10 Issue (2) : 281-291    https://doi.org/10.1007/s11704-015-4432-3
RESEARCH ARTICLE
Efficient multi-event monitoring using built-in search engines
Zhaoman ZHONG1,*, Zongtian LIU2, Yun HU1, Cunhua LI1
1. School of Computer Engineering, Huaihai Institute of Technology, Lianyungang 222006, China
2. School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China
Abstract

Internet users often wish to follow certain news events, and their interests often overlap. General search engines (GSEs) cannot be used to achieve this task because of incomplete coverage and lack of freshness. Instead, a broker regularly queries the built-in search engines (BSEs) of news and social media sites. Each user defines an event profile consisting of a set of query rules called event rules (ERs). To ensure that queries match the semantics of BSEs, ERs are transformed into disjunctive normal form and separated into conjunctive clauses (atomic event rules, AERs). Processing all AERs on BSEs is slow and can violate query submission rate limits. Accordingly, the set of AERs is reduced by eliminating AERs that are duplicates of, or logically contained by, other AERs. Five types of events are selected for experimental comparison and analysis: natural disasters, accident disasters, public health events, social security events, and negative events involving public servants. Using 12 BSEs, five users define 85 ERs for the five event types. The experimental comparison covers three aspects: event rule reduction ratio, number of collected events, and number of related events. The experimental results show that event rule reduction effectively enhances crawling efficiency.
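The rule pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nested-tuple rule format, the function names `to_aers` and `reduce_aers`, and the sample keywords are all assumptions made for illustration. It also assumes that ERs contain only AND/OR over keywords (no negation), and that an AER is "logically contained" by another when its keyword set is a strict superset of the other's, so that the broader query already retrieves its results.

```python
from itertools import product


def to_aers(rule):
    """Expand a nested AND/OR rule into disjunctive normal form.

    A rule is either a keyword string or a tuple ("and" | "or", operand, ...).
    Returns a set of conjunctive clauses, each a frozenset of keywords.
    (Hypothetical rule format; negation is not handled.)
    """
    if isinstance(rule, str):
        return {frozenset([rule])}
    op, *operands = rule
    parts = [to_aers(operand) for operand in operands]
    if op == "or":
        # OR: the clauses of all operands are simply pooled.
        return set().union(*parts)
    if op == "and":
        # AND: distribute over OR by merging one clause from each operand.
        return {frozenset().union(*combo) for combo in product(*parts)}
    raise ValueError(f"unknown operator: {op}")


def reduce_aers(aers):
    """Drop duplicate AERs and AERs contained by a more general AER.

    Assumption: clause A is contained by clause B when B's keywords are a
    strict subset of A's, because every document matching A also matches B.
    """
    unique = set(aers)  # removes exact duplicates
    return {a for a in unique if not any(b < a for b in unique)}


# Two hypothetical users with overlapping interests.
user1_rule = ("and", "earthquake", ("or", "Sichuan", "Yunnan"))
user2_rule = ("or", "earthquake", ("and", "flood", "Yunnan"))

all_aers = to_aers(user1_rule) | to_aers(user2_rule)
reduced = reduce_aers(all_aers)

print(sorted(sorted(clause) for clause in all_aers))
print(sorted(sorted(clause) for clause in reduced))
```

In this toy case the four AERs produced by the two users shrink to two, because the single-keyword clause `earthquake` subsumes both of the first user's clauses. A real broker would still need to route the results of the broader query back to the users whose narrower AERs were dropped, a step this sketch leaves out.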

Keywords: information retrieval, event retrieval, event monitoring, BSEs, event rule reduction
Corresponding Author(s): Zhaoman ZHONG   
Just Accepted Date: 16 March 2015   Issue Date: 16 March 2016
 Cite this article:   
Zhaoman ZHONG,Zongtian LIU,Yun HU, et al. Efficient multi-event monitoring using built-in search engines[J]. Front. Comput. Sci., 2016, 10(2): 281-291.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-015-4432-3
https://academic.hep.com.cn/fcs/EN/Y2016/V10/I2/281