TY - JOUR
T1 - Self-supervised relation extraction from the Web
AU - Rozenfeld, Benjamin
AU - Feldman, Ronen
PY - 2008
Y1 - 2008
N2 - Web extraction systems attempt to use the immense amount of unlabeled text in the Web in order to create large lists of entities and relations. Unlike traditional Information Extraction methods, the Web extraction systems do not label every mention of the target entity or relation, instead focusing on extracting as many different instances as possible while keeping the precision of the resulting list reasonably high. SRES is a self-supervised Web relation extraction system that learns powerful extraction patterns from unlabeled text, using short descriptions of the target relations and their attributes. SRES automatically generates the training data needed for its pattern-learning component. The performance of SRES is further enhanced by classifying its output instances using the properties of the instances and the patterns. The features we use for classification and the trained classification model are independent from the target relation, which we demonstrate in a series of experiments. We also compare the performance of SRES to the performance of the state-of-the-art KnowItAll system, and to the performance of its pattern learning component, which learns simpler pattern language than SRES.
AB - Web extraction systems attempt to use the immense amount of unlabeled text in the Web in order to create large lists of entities and relations. Unlike traditional Information Extraction methods, the Web extraction systems do not label every mention of the target entity or relation, instead focusing on extracting as many different instances as possible while keeping the precision of the resulting list reasonably high. SRES is a self-supervised Web relation extraction system that learns powerful extraction patterns from unlabeled text, using short descriptions of the target relations and their attributes. SRES automatically generates the training data needed for its pattern-learning component. The performance of SRES is further enhanced by classifying its output instances using the properties of the instances and the patterns. The features we use for classification and the trained classification model are independent from the target relation, which we demonstrate in a series of experiments. We also compare the performance of SRES to the performance of the state-of-the-art KnowItAll system, and to the performance of its pattern learning component, which learns simpler pattern language than SRES.
KW - Pattern learning
KW - Relationship extraction
KW - Text mining
KW - Unsupervised learning
KW - Web extraction
UR - http://www.scopus.com/inward/record.url?scp=54049086964&partnerID=8YFLogxK
U2 - 10.1007/s10115-007-0110-6
DO - 10.1007/s10115-007-0110-6
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:54049086964
SN - 0219-1377
VL - 17
SP - 17
EP - 33
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 1
ER -