XSEarch: A Semantic Search Engine for XML

Sara Cohen, Jonathan Mamou, Yaron Kanza, Yehoshua Sagiv

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

6 Scopus citations

Abstract

This chapter describes XSEarch, a semantic search engine for XML. XSEarch has a simple query language, suitable for a naive user. It returns semantically related document fragments that satisfy the user’s query. Query answers are ranked using extended information-retrieval techniques and are generated in an order similar to the ranking. Advanced indexing techniques were developed to facilitate efficient implementation of XSEarch. The performance of the different techniques as well as the recall and the precision were measured experimentally. These experiments indicate that XSEarch is efficient, scalable, and ranks quality results highly. Numerous query languages for XML have been developed. Recently, interest has arisen in techniques for “flexible querying” of XML. For example, the XQuery working group is considering how to add full-text search features and ranking to XQuery. Such capabilities have already been added to various XML query languages. One such proposal extends XML-QL with keyword search and presents performance experiments. XIRQL is an extension of XQL that supports vague predicates, weighting of terms, and minimal structural abstracting. XSEarch returns semantically related fragments, ranked by estimated relevance. The chapter concludes that XSEarch can be seen as a general framework for semantic searching in XML documents.

Original language: English
Title of host publication: Proceedings 2003 VLDB Conference
Subtitle of host publication: 29th International Conference on Very Large Databases (VLDB)
Publisher: Elsevier
Pages: 45-56
Number of pages: 12
ISBN (Electronic): 9780127224428
State: Published - 1 Jan 2003

Bibliographical note

Publisher Copyright:
© 2003 Elsevier Inc. All rights reserved.
