Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2015, Vol. 9 Issue (2) : 253-264    https://doi.org/10.1007/s11704-014-3473-3
RESEARCH ARTICLE
Efficient subtree results computation for XML keyword queries
Ziyang CHEN1,3,Jia LIU1,2,*(),Xingmin ZHAO1,3,Junfeng ZHOU1,3
1. School of Information Science and Engineering, Yanshan University, Hebei 066004, China
2. Environmental Management College of China, Hebei 066004, China
3. The Key Laboratory for Computer Virtual Technology and System Integration of HeBei Province, Yanshan University, Hebei 066004, China
 Download: PDF(720 KB)  
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

In this paper, we focus on efficient construction of restricted subtree (RSubtree) results for XML keyword queries on a multicore system. We firstly show that the performance bottlenecks for existing methods lie in 1) computing the set of relevant keyword nodes(RKNs) for each subtree root node, 2) constructing the corresponding RSubtree, and 3) parallel execution. We then propose a two-step generic top-down subtree construction algorithm, which computes SLCA/ELCA nodes in the first step, and parallelly gets RKNs and generates RSubtree results in the second step, where generic means that 1) our method can be used to compute different kinds of subtree results, 2) our method is independent of the query semantics; top-down means that our method constructs each RSubtree by visiting nodes of the subtree constructed based on an RKN set level-by-level from left to right, such that to avoid visiting as many useless nodes as possible. The experimental results show that our method is much more efficient than existing ones according to various metrics.

Keywords XML      SLCA/ELCA      RSubtree      RKNs      parallel     
Corresponding Author(s): Jia LIU   
Issue Date: 07 April 2015
 Cite this article:   
Ziyang CHEN,Jia LIU,Xingmin ZHAO, et al. Efficient subtree results computation for XML keyword queries[J]. Front. Comput. Sci., 2015, 9(2): 253-264.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-3473-3
https://academic.hep.com.cn/fcs/EN/Y2015/V9/I2/253
1 Zhou R, Liu C F, Li J X. Fast ELCA computation for keyword queries on XML data. In: Proceedings of the 13th International Conference on Extending Database Technology. 2010, 549-560
https://doi.org/10.1145/1739041.1739107
2 Xu Y, Papakonstantinou Y. Efficient keyword search for smallest LCAs in XML databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2005, 537-538
https://doi.org/10.1145/1066157.1066217
3 Chen L J, Papakonstantinou Y. Supporting top-K keyword search in XML databases. In: Proceedings of the 26th International Conference on Data Engineering. 2010, 689-700
https://doi.org/10.1109/ICDE.2010.5447818
4 Guo L, Shao F, Botev C, Shanmugasundaram J. XRANK: ranked keyword search over XML Documents. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. 2003, 16-27
https://doi.org/10.1145/872757.872762
5 Li Y Y, Yu C, Jagadish H V. Schema-free XQuery. In: Proceedings of the 13th International Conference on Very Large Data Bases. 2004, 72-83
https://doi.org/10.1016/B978-012088469-8/50010-3
6 Kong L B, Gilleron R, Lemay A. Retrieving meaningful relaxed tightest fragments for XML keyword search. In: Proceedings of the 12th International Conference on Extending Database Technology. 2009, 815-826
https://doi.org/10.1145/1516360.1516454
7 Liu Z Y, Chen Y. Reasoning and identifying relevant matches for XML keyword search. The Proceedings of the VLOB Endowment, 2008, 1(1): 921-932
https://doi.org/10.14778/1453856.1453956
8 Xu Y, Papakonstantinou Y. Efficient LCA based keyword search in XML data. In: Proceedings of the 11th International Conference on Extending Database Technology. 2008, 535-546
https://doi.org/10.1145/1352431.1352496
9 Tatarinov I, Viglas S, Beyer K S, Shanmugasundaram J, Shekita E J, Zhang C. Storing and querying ordered XML using a relational database system. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data. 2002, 204-215
https://doi.org/10.1145/564691.564715
10 Hristidis V, Koudas N, Papakonstantinou Y, Srivastava D. Keyword proximity search in XML Trees. IEEE Transactions on Knowledge and Date Engineering, 2006, 18(4): 525-539
https://doi.org/10.1109/TKDE.2006.1599390
11 Zhou J F, Bao Z F, Chen Z Y, Ling T W. Fast result enumeration for keyword queries on XML data. In: Proceedings of the 17th International Conference on Database Systems for Advanced Applications. 2012, 95-109
https://doi.org/10.1007/978-3-642-29038-1_9
12 Zhou J F, Bao Z F, Wang W, Ling T W, Chen Z Y, Lin X D, Guo J F. Fast SLCA and ELCA computation for XML keyword queries based on set intersection. In: Proceedings of the IEEE 28th International Conference on Data Engineering. 2012, 905-916
https://doi.org/10.1109/ICDE.2012.75
13 Zhou J F, Bao Z F, Chen Z Y, Lan G X, Lin X D, Ling T W. Top-down SLCA computation based on list partition. In: Proceedings of the 17th International Conference on Database Systems for Advanced Applications. 2012, 172-184
https://doi.org/10.1007/978-3-642-29038-1_14
14 Pandis I, Johnson R, Hardavellas N, Ailamaki A. Data-oriented transaction execution. The Proceedings of the VLOB Endowment, 2010, 3(1): 928-939
https://doi.org/10.14778/1920841.1920959
15 Tsirogiannis D, Guha S, Koudas N. Improving the performance of list intersection. The Proceedings of the VLOB Endowment, 2009, 2(1): 838-849
https://doi.org/10.14778/1687627.1687722
16 Bordawekar R, Lim L, Kementsietsidis A, K B. Statistics-based parallelization of XPath queries in shared memory systems. In: Proceedings of the 13th International Conference on Extending Database Technology. 2010, 159-170
https://doi.org/10.1145/1739041.1739063
17 Qin L, Yu J X, Chang L J. Ten thousand SQLs: parallel keyword queries computing. The Proceedings of the VLOB Endowment, 2010, 3(1): 58-69
https://doi.org/10.14778/1920841.1920854
18 Zhou J F, Zhao J J, Wang B, Zhao X M, Chen Z Y. Efficient Msubtree results computation for XML keyword queries. In: Proceedings of the 14th International Conference on Web-Age Information Management. 2013, 472-477
https://doi.org/10.1007/978-3-642-38562-9_48
19 Sun C, Chan C Y, Goenka A K. Multiway SLCA-based keyword search in XML data. In: Proceedings of the 16th International Conference on World Wide Web. 2007, 1043-1052
https://doi.org/10.1145/1242572.1242713
20 Wang W Y, Wang X L, Zhou A Y. Hash-search: an efficient SLCAbased keyword search algorithm on XML documents. In: Proceedings of the 14th International Conference on Database Systems for Advanced Applications. 2009, 496-510
https://doi.org/10.1007/978-3-642-00887-0_44
[1] Supplementary Material-Highlights in 3-page ppt
Download
[1] Bing WEI, Limin XIAO, Bingyu ZHOU, Guangjun QIN, Baicheng YAN, Zhisheng HUO. Fine-grained management of I/O optimizations based on workload characteristics[J]. Front. Comput. Sci., 2021, 15(3): 153102-.
[2] Neng HOU, Fazhi HE, Yi ZHOU, Yilin CHEN. An efficient GPU-based parallel tabu search algorithm for hardware/software co-design[J]. Front. Comput. Sci., 2020, 14(5): 145316-.
[3] Lingli LI, Hongzhi WANG, Jianzhong LI, Hong GAO. A survey of uncertain data management[J]. Front. Comput. Sci., 2020, 14(1): 162-190.
[4] Wenjie LIU, Zhanhuai LI. An efficient parallel algorithm of N-hop neighborhoods on graphs in distributed environment[J]. Front. Comput. Sci., 2019, 13(6): 1309-1325.
[5] Libo FENG, Hui ZHANG, Wei-Tek TSAI, Simeng SUN. System architecture for high-performance permissioned blockchains[J]. Front. Comput. Sci., 2019, 13(6): 1151-1165.
[6] Chengcheng YU, Fan XIA, Weining QIAN, Aoying ZHOU. A parallel data generator for efficiently generating “realistic” social streams[J]. Front. Comput. Sci., 2019, 13(5): 1072-1101.
[7] Pengfei WANG, Kai LU, Gen LI, Xu ZHOU. DFTracker: detecting double-fetch bugs by multi-taint parallel tracking[J]. Front. Comput. Sci., 2019, 13(2): 247-263.
[8] Xuepeng FAN, Xiaofei LIAO, Hai JIN. FunctionFlow: coordinating parallel tasks[J]. Front. Comput. Sci., 2019, 13(1): 73-85.
[9] Diming ZHANG, Fei XUE, Hao HUANG, Shaodi YOU. VBMq: pursuit baremetal performance by embracing block I/O parallelism in virtualization[J]. Front. Comput. Sci., 2018, 12(5): 873-886.
[10] Anil Kumar KARNA, Jinbo DU, Haihao SHEN, Hao ZHONG, Jiong GONG, Haibo YU, Xiangning MA, Jianjun ZHAO. Tuning parallel symbolic execution engine for better performance[J]. Front. Comput. Sci., 2018, 12(1): 86-100.
[11] Donggang CAO, Lianghuan KANG, Hanglong ZHAN, Hong MEI. Towards application-level elasticity on shared cluster: an actor-based approach[J]. Front. Comput. Sci., 2017, 11(5): 803-820.
[12] Guoliang CHEN, Rui MAO, Kezhong LU. A parallel computing framework for big data[J]. Front. Comput. Sci., 2017, 11(4): 608-621.
[13] Ruiji FU,Bing QIN,Ting LIU. Generating Chinese named entity data from parallel corpora[J]. Front. Comput. Sci., 2014, 8(4): 629-641.
[14] Yaobin HE, Haoyu TAN, Wuman LUO, Shengzhong FENG, Jianping FAN. MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data[J]. Front. Comput. Sci., 2014, 8(1): 83-99.
[15] Quan LIU, Xudong YANG, Ling JING, Jin LI, Jiao LI. A parallel scheduling algorithm for reinforcement learning in large state space[J]. Front Comput Sci, 2012, 6(6): 631-646.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed