TY - GEN
T1 - Certain and possible XPath answers
AU - Cohen, Sara
AU - Weiss, Yaacov Y.
PY - 2013
Y1 - 2013
N2 - Formulating an XPath query over an XML document is a difficult chore for a non-expert user. This paper introduces a novel approach to ease the querying process. Instead of specifying a query, the user simply marks positive examples X+ of nodes that fit her information need. She may also mark negative examples X- of undesirable nodes. A deductive method, to suggest additional nodes that will interest the user, is developed in this paper. To be precise, a node y is a certain answer if every query returning all positive examples X+, and not returning any negative example from X-, must also return y. Similarly, y is a possible answer if there exists a query returning X+ and y, while not returning any node in X-. Thus, y is likely to be of interest to the user if y is a certain answer, and unlikely to be of interest if y is not even a possible answer. The complexity of finding certain and possible answers, with respect to various classes of XPath, is studied. It is shown that for a wide variety of XPath queries (including child and descendant axes, wildcards, branching and attribute constraints), certain and possible answers can be found efficiently, provided that X+ and X- are of bounded size. To prove this result, a novel algorithm is developed.
UR - http://www.scopus.com/inward/record.url?scp=84875584414&partnerID=8YFLogxK
U2 - 10.1145/2448496.2448525
DO - 10.1145/2448496.2448525
M3 - Conference contribution
AN - SCOPUS:84875584414
SN - 9781450315982
T3 - ACM International Conference Proceeding Series
SP - 237
EP - 248
BT - ICDT 2013 - 16th International Conference on Database Theory, Proceedings
T2 - 16th International Conference on Database Theory, ICDT 2013
Y2 - 18 March 2013 through 22 March 2013
ER -