Query evaluation over probabilistic XML

Benny Kimelfeld*, Yuri Kosharovsky, Yehoshua Sagiv

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

44 Scopus citations

Abstract

Query evaluation over probabilistic XML is explored. The queries are twig patterns with projection, and the data is represented in terms of three models of probabilistic XML (that extend existing ones in the literature). The first model makes an assumption of independence among the probabilistic junctions, whereas the second model can encode probabilistic dependencies. The third model combines the first two and, hence, is the most general. An efficient algorithm (under data complexity) is given for query evaluation in the first model. In addition, various optimizations are proposed, and their effectiveness is shown both analytically and experimentally. For the other two models, it is shown that every query is either intractable or trivial. Nonetheless, efficient (additive and multiplicative) approximation algorithms are given for these two models. Finally, Boolean queries are enriched by allowing disjunctions and negations of branches. The above algorithm for the first model is extended to handle these queries. For the other two models, there is an efficient additive approximation, and a multiplicative one also exists if there is no negation; in addition, it is shown that if the query is non-monotonic, then no efficient multiplicative approximation exists unless NP = RP.
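As a rough illustration of why independence among the probabilistic junctions helps, the sketch below computes the probability that a simple Boolean path pattern matches a random document. It is not the algorithm of the article. It assumes a simplified model in which every parent-child edge survives independently with a given probability, and the pattern is a plain list of labels joined by child edges. The names `Node` and `path_match_probability` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    # Each entry is (probability, child). Assumption of this sketch: every
    # edge survives independently of all other edges with that probability.
    children: list = field(default_factory=list)


def path_match_probability(node, pattern):
    """Probability that `pattern` (labels joined by child edges) matches
    starting at `node`, given that `node` itself exists."""
    if node.label != pattern[0]:
        return 0.0
    if len(pattern) == 1:
        return 1.0
    # Subtrees of distinct children are independent, so the probability
    # that no child yields a match is a product over the children.
    no_match = 1.0
    for prob, child in node.children:
        no_match *= 1.0 - prob * path_match_probability(child, pattern[1:])
    return 1.0 - no_match
```

Under these assumptions a single bottom-up pass over the document suffices for a fixed pattern. Once edges may depend on one another, the product over children is no longer valid, which is consistent with the hardness results stated for the other two models.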

Original language: English
Pages (from-to): 1117-1140
Number of pages: 24
Journal: VLDB Journal
Volume: 18
Issue number: 5
State: Published - Oct 2009

Keywords

  • Approximate query evaluation
  • Probabilistic databases
  • Probabilistic XML
  • Query optimization
  • Query processing
