Keyword proximity search in complex data graphs

Konstantin Golenberg*, Benny Kimelfeld, Yehoshua Sagiv

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

137 Scopus citations

Abstract

In keyword search over data graphs, an answer is a non-redundant subtree that includes the given keywords. An algorithm for enumerating answers is presented within an architecture that has two main components: an engine that generates a set of candidate answers and a ranker that evaluates their scores. To be effective, the engine must have three fundamental properties: it must not miss relevant answers, it must be efficient, and it must generate the answers in an order that is highly correlated with the desired ranking. It is shown that none of the existing systems has implemented an engine that has all of these properties. In contrast, this paper presents an engine that generates all the answers with provable guarantees. Experiments show that the engine performs well in practice. It is also shown how to adapt this engine to queries under the OR semantics. In addition, this paper presents a novel approach for implementing rankers designed to eliminate redundancy. Essentially, an answer is ranked according to its individual properties (relevancy) and its intersection with the answers that have already been presented to the user. Within this approach, experiments with specific rankers are described.
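
The ranking idea in the last sentences of the abstract can be illustrated with a small sketch. This is not the paper's ranker: the greedy loop, the Jaccard overlap measure, the `penalty` weight, and the names `overlap` and `rerank` are all assumptions chosen for illustration. Each candidate answer is modelled as a set of node identifiers with a relevance score, and a candidate's score is reduced by its largest overlap with any answer already shown.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical answer model: (label, nodes of the answer subtree, relevance score).
Answer = Tuple[str, FrozenSet[int], float]


def overlap(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard similarity of two node sets (an assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank(candidates: List[Answer], penalty: float = 0.5) -> List[Answer]:
    """Greedily order answers by relevance minus a redundancy penalty.

    The penalty is proportional to the largest overlap between a candidate
    and any answer that has already been selected for presentation.
    """
    remaining = list(candidates)
    shown: List[Answer] = []
    while remaining:
        def adjusted(ans: Answer) -> float:
            _, nodes, relevance = ans
            worst = max((overlap(nodes, s[1]) for s in shown), default=0.0)
            return relevance - penalty * worst

        best = max(remaining, key=adjusted)
        remaining.remove(best)
        shown.append(best)
    return shown


candidates = [
    ("A", frozenset({1, 2, 3}), 0.90),
    ("B", frozenset({1, 2, 4}), 0.85),  # shares two nodes with A
    ("C", frozenset({7, 8, 9}), 0.80),  # disjoint from A and B
]
print([label for label, _, _ in rerank(candidates)])
```

In this toy input, B is more relevant than C but shares two of its three nodes with A, so with the default penalty it is presented after C. Setting `penalty=0.0` falls back to ordering by relevance alone.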

Original language: English
Title of host publication: SIGMOD 2008
Subtitle of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data 2008
Pages: 927-940
Number of pages: 14
State: Published - 2008
Event: 2008 ACM SIGMOD International Conference on Management of Data 2008, SIGMOD'08 - Vancouver, BC, Canada
Duration: 9 Jun 2008 to 12 Jun 2008

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2008 ACM SIGMOD International Conference on Management of Data 2008, SIGMOD'08
Country/Territory: Canada
City: Vancouver, BC
Period: 9/06/08 to 12/06/08

Keywords

  • Approximate top-k answers
  • Information retrieval on graphs
  • Keyword proximity search
  • Redundancy elimination
  • Subtree enumeration by height
