A scalable and effective full-text search in P2P networks

Yosi Mass*, Yehoshua Sagiv, Michal Shmueli-Scheuer

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

We consider the problem of full-text search involving multi-term queries in a network of self-organizing, autonomous peers. Existing approaches do not scale well with respect to the number of peers, because they either require access to a large number of peers or incur a high communication cost in order to achieve good query results. In this paper, we present a novel algorithmic framework for processing multi-term queries in P2P networks that achieves high recall while using (per query) a small number of peers and a low communication cost, thereby enabling high query throughput. Our approach is based on a per-query peer-selection strategy that uses two-dimensional histograms of score distributions. Fully utilizing the histograms incurs a high communication cost. We show how to drastically reduce this cost by employing a two-phase peer-selection algorithm. We also describe an adaptive approach to peer selection that further increases the recall. Experiments on a large real-world collection show that the recall is indeed high while the number of involved peers and the communication cost are low.

Original language: English
Title of host publication: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Pages: 1979-1982
Number of pages: 4
State: Published - 2009
Event: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009 - Hong Kong, China
Duration: 2 Nov 2009 to 6 Nov 2009

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: ACM 18th International Conference on Information and Knowledge Management, CIKM 2009
Country/Territory: China
City: Hong Kong
Period: 2/11/09 to 6/11/09

Keywords

  • Clustering
  • DHT
  • Histograms
  • P2P search
