TY - JOUR
T1 - Utilizing the multiple facets of WWW contents
AU - Kogan, Yakov
AU - Michaeli, David
AU - Sagiv, Yehoshua
AU - Shmueli, Oded
PY - 1998/12/15
Y1 - 1998/12/15
AB - Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, usually, the structure of the Web has little to do with the semantics of the data. Therefore, it is practically difficult to pose database queries over the Web. We introduce a new type of tags for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose on HTML pages semistructured objects in the style of the OEM model. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool for displaying both the HTML reading of Web pages and the OEM reading of Web pages. The second tool is a query language, similar to LOREL, that can query the HTML structure and/or the OEM reading. The above formalism and tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used.
KW - Lorel
KW - OEM
KW - OHTML
KW - Query language
KW - Semantic tags
KW - Semistructured data
KW - W3LOREL
KW - WWW
UR - http://www.scopus.com/inward/record.url?scp=0043100100&partnerID=8YFLogxK
U2 - 10.1016/S0169-023X(98)00026-3
DO - 10.1016/S0169-023X(98)00026-3
M3 - Article
AN - SCOPUS:0043100100
SN - 0169-023X
VL - 28
SP - 255
EP - 275
JO - Data and Knowledge Engineering
JF - Data and Knowledge Engineering
IS - 3
ER -