Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2008, Vol. 2 Issue (3) : 234-247    https://doi.org/10.1007/s11704-008-0026-7
Supporting nearest neighbors query on high-dimensional data in P2P systems
LI Mei1, LEE Wang-Chien1, SIVASUBRAMANIAM Anand1, ZHAO Jizhong2
1.Department of Computer Science and Engineering, The Pennsylvania State University, Philadelphia; 2.Department of Computer Science and Technology, Xi'an Jiaotong University;
 Download: PDF(455 KB)   HTML
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract Peer-to-peer systems have been widely used for sharing and exchanging data and resources among numerous computer nodes. Various data objects identifiable with high dimensional feature vectors, such as text, images, genome sequences, are starting to leverage P2P technology. Most of the existing works have been focusing on queries on data objects with one or few attributes and thus are not applicable on high dimensional data objects. In this study, we investigate K nearest neighbors query (KNN) on high dimensional data objects in P2P systems. Efficient query algorithm and solutions that address various technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement, are proposed. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN query on high dimensional data in P2P systems.
Issue Date: 05 September 2008
 Cite this article:   
LEE Wang-Chien,LI Mei,SIVASUBRAMANIAM Anand, et al. Supporting nearest neighbors query on high-dimensional data in P2P systems[J]. Front. Comput. Sci., 2008, 2(3): 234-247.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-008-0026-7
https://academic.hep.com.cn/fcs/EN/Y2008/V2/I3/234
1 Ratnasamy S, Francis P, Handley M, et al.. Scalable, distributed object location and routingfor large-scale peer-to-peer systems . In: Proceedings of ACM SIGCOMM 2001New York: ACM Press, 2001, 161–172
2 Stoica I, Morris R, Karger D, et al.. Chord: A scalable peer-to-peer lookup servicefor Internet applications. In: Proceedingsof ACM SIGCOMM 2001. New York: ACM Press, 2001, 149–160
3 Rowstron A I T, Druschel P . Pastry: Scalable, distributedobject location and routing for large-scale peer-to-peer systems. In: Proceedings of IFIP/ACM International Conferenceon Distributed Systems Platforms (Middleware).New York: ACM Press, 2001, 329–350
4 Zhao B Y, Kubiatowicz J D, Joseph A D . Tapestry: an infrastructure for fault-tolerant wide-arealocation and routing. Technical Report UCS/CSD-01-1141, Computer ScienceDivision, U. C.Berkeley, 2001
5 Andrzejak A, Xu Z . Scalable, efficient rangequeries for grid information services. In: Proceedings of IEEE International Conference on Peer-to-Peer Computing(P2P).Wahsington D.C.: IEEE ComputingSoceity, 2002, 33–40
6 Aspnes J, Shah G . Skip graphs. In: Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA).New York: ACM Press, 2003, 384–393
7 Bharambe A R, Agrawal M, Seshan S . Mercury: Supporting scalable multi-attribute range queries. In: Proceedings of ACM SIGCOMM.New York: ACM Press, 2004, 353–366
8 Ganesan P, Bawa M, Garcia-Molina H . Online balancing of range-partitioned data with applicationsto peer-to-peer systems. In: Proceedingsof International Conference on Very Large Data Bases (VLDB).VLDB Endowment, 2004, 444–455
9 Gao J, Steenkiste P . An adaptive protocol forefficient support of range queries in DHT-based systems. In: Proceedings of IEEE International Conferenceon Network Protocols (ICNP). WashingtonD.C.: IEEE Computer Society, 2004, 239–250
10 Gupta A, Agrawal D, Abbadi A E . Approximate range selection queries in peer-to-peer systems. In: Proceedings of Biennial Conference on InnovativeData Systems Research (CIDR), 2003
11 Sahin O, Gupta A, Agrawal D, et al.. A peer-to-peer framework for caching range queries. In: Proceedings of International Conference onData Engineering (ICDE).WashintonD.C.: IEEE Computer Society, 2004, 165–176
12 Shu Y, Ooi B C, Tan KL, et al.. Supporting multi-dimensional range queries inpeer-to-peer systems. In: Proceedings ofIEEE International Conference on Peer-to-Peer Computing (P2P).Washington D.C.: IEEE Computer Society, 2005, 173–180
13 Banaei-Kashani F, Shahabi C . SWAM: a family of accessmethods for similarity-search in peer-to-peer data networks. In: Proceedings of ACM Conference on Informationand Knowledge Management (CIKM).New York: ACM Press, 2004, 304–313
14 Jagadish H V, Ooi BC, Vu Q H, et al.. VBI-Tree: a peer-to-peer framework for supportingmulti-dimensional indexing schemes. In: Proceedings of International Conference on Data Engineering (ICDE), 2006
15 Li M, Lee W-C, Sivasubramaniam A . DPTree: a balanced tree based indexingframework for peer-to-peer systems. In: Proceedings of International Conference on Network Protocols (ICNP). Washington D.C.: IEEE Computer Society, 2006, 12–21
16 Liu B, Lee W-C, Lee D L . Supporting complex multi-dimensional queries in P2P systems. In: Proceedings of International Conference onDistributed Computing Systems (ICDCS), 2005, 155–164
17 Tanin E, Nayar D, Samet H . An efficient nearest neighbor algorithm for P2P settings. In: Proceedings of National Conference on DigitalGovernment Research, 2005, 21–28
18 Li M, Lee W-C, Sivasubramaniam A . Semantic small world: An overlay networkfor peer-to-peer search. In: Proceedingsof International Conference on Network Protocols (ICNP).Washington D.C.: IEEE Computer Society, 2004, 228–238
19 Li M, Lee W-C, Sivasubramaniam A, et al.. Ssw: a small world based overlayfor peer-to-peer search. IEEE Transactionon Parallel and Distributed Systems, 2008, 19(2): 735–749.
doi:10.1109/TPDS.2007.70757
20 Ganesan P, Yang B, Garcia-Molina B . One torus to rule them all: Multidimensional queriesin P2P systems. In: Proceedings of InternationalWorkshop on the Web and Databases (WebDB), 2004, 19–24
21 Tang C, Xu Z, Dwarkadas S . Peer-to-peer information retrieval using self-organizingsemantic overlay networks. In: Proceedingsof ACM SIGCOMM. New York: AMC Press, 2003, 175–186
22 Müller W, Henrich A . Fast retrieval of high-dimensionalfeature vectors in P2P networks using compact peer data summaries. In: Proceedings of ACM SIGMM International Workshopon Multimedia Information Retrieval (MIR).New York: ACM Press, 2003, 79–86
23 Aberer K . P-Grid:a self-organizing access structure for P2P information systems. In: Proceedings of International Conference onCooperative Information Systems (CoopIS) 2001, 179–194
24 Crainiceanu A, Linga P, Gehrke J, et al.. Querying peer-to-peer networks using P-trees. In: Proceedings of International Workshop on theWeb and Databases (WebDB).New York: ACM Press, 2004, 25–30
25 Houle M . E, Sakuma J . Fast approximate similaritysearch in extremely high-dimensional data sets. In: Proceedings of International Conference on Data Engineering (ICDE). Washinton DC.: IEEE Computer Society, 2005, 619–630
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed