Abstract
Metric validation in Grammatical Error Correction (GEC) is currently done by observing the correlation between human and metric-induced rankings. However, such correlation studies are costly, methodologically troublesome, and suffer from low inter-rater agreement. We propose MAEGE, an automatic methodology for GEC metric validation that overcomes many of the difficulties of existing practices. Experiments with MAEGE shed new light on metric quality, showing, for example, that the standard M2 metric fares poorly on corpus-level ranking. Moreover, we use MAEGE to perform a detailed analysis of metric behavior, showing that correcting some types of errors is consistently penalized by existing metrics.
| Original language | English |
|---|---|
| Title of host publication | ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 1372-1382 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781948087322 |
| State | Published - 2018 |
| Event | 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 — Melbourne, Australia |
| Duration | 15 Jul 2018 → 20 Jul 2018 |
Publication series
| Name | ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) |
|---|---|
| Volume | 1 |
Conference
| Conference | 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 |
|---|---|
| Country/Territory | Australia |
| City | Melbourne |
| Period | 15/07/18 → 20/07/18 |
Bibliographical note
Publisher Copyright: © 2018 Association for Computational Linguistics