TY - CONF
T1 - The NVI clustering evaluation measure
AU - Reichart, Roi
AU - Rappoport, Ari
PY - 2009
Y1 - 2009
N2 - Clustering is crucial for many NLP tasks and applications. However, evaluating the results of a clustering algorithm is hard. In this paper we focus on the evaluation setting in which a gold standard solution is available. We discuss two existing information theory based measures, V and VI, and show that they are both hard to use when comparing the performance of different algorithms and different datasets. The V measure favors solutions having a large number of clusters, while the range of scores given by VI depends on the size of the dataset. We present a new measure, NVI, which normalizes VI to address the latter problem. We demonstrate the superiority of NVI in a large experiment involving an important NLP application, grammar induction, using real corpus data in English, German and Chinese.
UR - http://www.scopus.com/inward/record.url?scp=84858319308&partnerID=8YFLogxK
U2 - 10.3115/1596374.1596401
DO - 10.3115/1596374.1596401
M3 - Conference contribution
AN - SCOPUS:84858319308
SN - 1932432299
SN - 9781932432299
T3 - CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning
SP - 165
EP - 173
BT - CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning
PB - Association for Computational Linguistics (ACL)
T2 - 13th Conference on Computational Natural Language Learning, CoNLL 2009
Y2 - 4 June 2009 through 5 June 2009
ER -