TY - CONF
T1 - Type level clustering evaluation
T2 - 14th Conference on Computational Natural Language Learning, CoNLL 2010
AU - Reichart, Roi
AU - Abend, Omri
AU - Rappoport, Ari
PY - 2010
Y1 - 2010
N2 - Clustering is a central technique in NLP. Consequently, clustering evaluation is of great importance. Many clustering algorithms are evaluated by their success in tagging corpus tokens. In this paper we discuss type level evaluation, which reflects class membership only and is independent of the token statistics of a particular reference corpus. Type level evaluation casts light on the merits of algorithms, and for some applications is a more natural measure of the algorithm's quality. We propose new type level evaluation measures that, contrary to existing measures, are applicable when items are polysemous, the common case in NLP. We demonstrate the benefits of our measures using a detailed case study, POS induction. We experiment with seven leading algorithms, obtaining useful insights and showing that token and type level measures can weakly or even negatively correlate, which underscores the fact that these two approaches reveal different aspects of clustering quality.
UR - http://www.scopus.com/inward/record.url?scp=80053286309&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:80053286309
SN - 9781932432831
T3 - CoNLL 2010 - Fourteenth Conference on Computational Natural Language Learning, Proceedings of the Conference
SP - 77
EP - 87
BT - CoNLL 2010 - Fourteenth Conference on Computational Natural Language Learning, Proceedings of the Conference
Y2 - 15 July 2010 through 16 July 2010
ER -