Abstract
Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is much work left to be done. We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs. In this crowdsourced, adversarially-created, 96k-question benchmark, a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than what was necessary for prior datasets. We apply state-of-the-art methods from both the reading comprehension and semantic parsing literatures on this dataset and show that the best systems only achieve 32.7% F1 on our generalized accuracy metric, while expert human performance is 96.4%. We additionally present a new model that combines reading comprehension methods with simple numerical reasoning to achieve 47.0% F1.
| Original language | English |
|---|---|
| Title of host publication | Long and Short Papers |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 2368-2378 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781950737130 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 - Minneapolis, United States. Duration: 2 Jun 2019 → 7 Jun 2019 |
Publication series
| Name | NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference |
|---|---|
| Volume | 1 |
Conference
| Conference | 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 |
|---|---|
| Country/Territory | United States |
| City | Minneapolis |
| Period | 2 Jun 2019 → 7 Jun 2019 |
Bibliographical note
Publisher Copyright: © 2019 Association for Computational Linguistics
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 4 Quality Education