TY - GEN
T1 - Automatic selection of high quality parses created by a fully unsupervised parser
AU - Reichart, Roi
AU - Rappoport, Ari
PY - 2009
Y1 - 2009
N2 - The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem. In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from English (WSJ) and German (NEGRA) corpora. We show that PUPA outperforms the leading previous parse assessment algorithm for supervised parsers, as well as a strong unsupervised baseline. Consequently, PUPA allows obtaining high quality parses without any human involvement.
UR - http://www.scopus.com/inward/record.url?scp=84862279736&partnerID=8YFLogxK
DO - 10.3115/1596374.1596400
M3 - Conference contribution
AN - SCOPUS:84862279736
SN - 1932432299
SN - 9781932432299
T3 - CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning
SP - 156
EP - 164
BT - CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning
PB - Association for Computational Linguistics (ACL)
T2 - 13th Conference on Computational Natural Language Learning, CoNLL 2009
Y2 - 4 June 2009 through 5 June 2009
ER -