The NVI clustering evaluation measure

Roi Reichart*, Ari Rappoport

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Scopus citations

Abstract

Clustering is crucial for many NLP tasks and applications. However, evaluating the results of a clustering algorithm is hard. In this paper we focus on the evaluation setting in which a gold standard solution is available. We discuss two existing information theory based measures, V and VI, and show that they are both hard to use when comparing the performance of different algorithms and different datasets. The V measure favors solutions having a large number of clusters, while the range of scores given by VI depends on the size of the dataset. We present a new measure, NVI, which normalizes VI to address the latter problem. We demonstrate the superiority of NVI in a large experiment involving an important NLP application, grammar induction, using real corpus data in English, German and Chinese.
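The abstract describes NVI as a normalization of the Variation of Information (VI) that removes VI's dependence on dataset size. As a minimal sketch, assuming the normalization divides VI by the entropy of the gold-standard classes H(C) (the exact definition is given in the paper), VI and NVI can be computed from two label assignments:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def vi(gold, induced):
    """Variation of Information: VI(C, K) = H(C|K) + H(K|C)."""
    n = len(gold)
    joint = Counter(zip(gold, induced))
    # Joint entropy H(C, K) over the contingency of gold and induced labels.
    h_joint = -sum((c / n) * math.log(c / n) for c in joint.values())
    # H(C|K) + H(K|C) = 2*H(C, K) - H(C) - H(K)
    return 2 * h_joint - entropy(gold) - entropy(induced)

def nvi(gold, induced):
    """Assumed form of NVI: VI divided by gold-class entropy H(C)."""
    h_c = entropy(gold)
    return vi(gold, induced) / h_c if h_c > 0 else entropy(induced)
```

For identical clusterings both measures are 0, and unlike raw VI, the normalized score is comparable across datasets of different sizes.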

Original language: English
Title of host publication: CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Publisher: Association for Computational Linguistics (ACL)
Pages: 165-173
Number of pages: 9
ISBN (Print): 1932432299, 9781932432299
State: Published - 2009
Event: 13th Conference on Computational Natural Language Learning, CoNLL 2009 - Boulder, CO, United States
Duration: 4 Jun 2009 - 5 Jun 2009

Publication series

Name: CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning

Conference

Conference: 13th Conference on Computational Natural Language Learning, CoNLL 2009
Country/Territory: United States
City: Boulder, CO
Period: 4/06/09 - 5/06/09
