TY - GEN
T1 - Query efficiency in probabilistic XML models
AU - Kimelfeld, Benny
AU - Kosharovsky, Yuri
AU - Sagiv, Yehoshua
PY - 2008
Y1 - 2008
AB - Various known models of probabilistic XML can be represented as instantiations of abstract p-documents. Such documents have, in addition to ordinary nodes, distributional nodes that specify the probabilistic process of generating a random document. Within this abstraction, families of p-documents, which are natural extensions and combinations of previous models, are considered. The focus is on efficiency of applying twig queries (with projection) to p-documents. A closely related issue is the ability to (efficiently) translate a given document of one family into another family. Furthermore, both of these tasks have two variants that correspond to the value-based and object-based semantics. The translation relationships among different families of p-documents are studied. An efficient algorithm for evaluating twig queries over one specific family is given. This algorithm generalizes a known algorithm and significantly improves its running time, both analytically and experimentally. It is shown that this family is the maximal, among the ones considered, for which query evaluation is tractable. For the rest, efficient approximate algorithms for query evaluation are presented.
KW - Approximate query evaluation
KW - Probabilistic databases
KW - Probabilistic XML
KW - Query optimization
KW - Query processing
UR - http://www.scopus.com/inward/record.url?scp=57149139242&partnerID=8YFLogxK
U2 - 10.1145/1376616.1376687
DO - 10.1145/1376616.1376687
M3 - Conference contribution
AN - SCOPUS:57149139242
SN - 9781605581026
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 701
EP - 714
BT - SIGMOD 2008
T2 - 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD'08
Y2 - 9 June 2008 through 12 June 2008
ER -