SWAG: A large-scale adversarial dataset for grounded commonsense inference

Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

271 Scopus citations

Abstract

Given a partial description like “she opened the hood of the car,” humans can reason about the situation and anticipate what might come next (“then, she examined the engine”). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning. We present Swag, a new dataset with 113k multiple choice questions about a rich spectrum of grounded situations. To address the recurring challenges of the annotation artifacts and human biases found in many existing datasets, we propose Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data. To account for the aggressive adversarial filtering, we use state-of-the-art language models to massively oversample a diverse set of potential counterfactuals. Empirical results demonstrate that while humans can solve the resulting inference problems with high accuracy (88%), various competitive models struggle on our task. We provide comprehensive analysis that indicates significant opportunities for future research.

Original languageAmerican English
Title of host publicationProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
EditorsEllen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii
PublisherAssociation for Computational Linguistics
Pages93-104
Number of pages12
ISBN (Electronic)9781948087841
StatePublished - 2018
Externally publishedYes
Event2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium
Duration: 31 Oct 20184 Nov 2018

Publication series

NameProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

Conference

Conference2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Country/TerritoryBelgium
CityBrussels
Period31/10/184/11/18

Bibliographical note

Funding Information:
We thank the anonymous reviewers, members of the ARK and xlab at the University of Washington, researchers at the Allen Institute for AI, and Luke Zettlemoyer for their helpful feedback. We also thank the Mechanical Turk workers for doing a fantastic job with the human validation. This work was supported by the National Science Foundation Graduate Research Fellowship (DGE-1256082), the NSF grant (IIS-1524371, 1703166), the DARPA CwC program through ARO (W911NF-15-1-0543), the IARPA DIVA program through D17PC00343, and gifts by Google and Facebook. The views and conclusions contained herein are those of the authors and should not be interpreted as representing endorsements of IARPA, DOI/IBC, or the U.S. Government.

Publisher Copyright:
© 2018 Association for Computational Linguistics

Fingerprint

Dive into the research topics of 'SWAG: A large-scale adversarial dataset for grounded commonsense inference'. Together they form a unique fingerprint.

Cite this