Abstract
Recurrent and convolutional neural networks are two distinct families of models that have proven useful for encoding natural language utterances. In this paper we present SoPa, a new model that aims to bridge these two approaches. SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns. We show that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version of SoPa and, accordingly, to a restricted form of WFSA. Empirically, on three text classification tasks, SoPa is comparable to or better than both a BiLSTM (RNN) baseline and a CNN baseline, and is particularly useful in small data settings.
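The claimed correspondence between a one-layer CNN and a restricted WFSA can be illustrated with a small sketch. This is not the paper's implementation. It assumes a linear-chain WFSA whose only transitions advance one state per token, where each transition score is an affine function of the token embedding, path scores are summed, and the best span is taken by max. The function names, weights, and toy embeddings below are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def chain_wfsa_span_scores(embeddings, trans_weights, trans_biases):
    """Score every span of d tokens with a (d+1)-state linear-chain WFSA.

    Assumption: only main-path transitions i -> i+1 exist, and a path's
    score is the sum of its transition scores (max-sum semiring).
    """
    d = len(trans_weights)
    scores = []
    for start in range(len(embeddings) - d + 1):
        total = 0.0
        for i in range(d):
            total += dot(trans_weights[i], embeddings[start + i]) + trans_biases[i]
        scores.append(total)
    return scores


def cnn_filter_scores(embeddings, filter_weights, bias):
    """Apply one width-d convolutional filter to every window of d tokens."""
    d = len(filter_weights)
    scores = []
    for start in range(len(embeddings) - d + 1):
        total = bias
        for i in range(d):
            total += dot(filter_weights[i], embeddings[start + i])
        scores.append(total)
    return scores


# Hypothetical toy input: 4 tokens with 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [3.0, -1.0]]
# Hypothetical parameters for a 3-state chain (two transitions).
trans_weights = [[1.0, -1.0], [0.5, 2.0]]
trans_biases = [0.5, -0.25]

wfsa_scores = chain_wfsa_span_scores(embeddings, trans_weights, trans_biases)
# The same weights read as one CNN filter, with the biases folded into one.
cnn_scores = cnn_filter_scores(embeddings, trans_weights, sum(trans_biases))

# Max over spans plays the role of max-pooling over filter positions.
wfsa_best = max(wfsa_scores)
cnn_best = max(cnn_scores)
```

Under these assumptions the two functions produce the same score for every window, so max-pooling over the filter outputs picks the same value as the best-scoring span of the chain automaton. The sketch leaves out anything that would let a pattern match spans of varying length, which is where the abstract's description of SoPa as an extension of a one-layer CNN comes in.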
Original language | American English |
---|---|
Title of host publication | ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 295-305 |
Number of pages | 11 |
ISBN (Electronic) | 9781948087322 |
State | Published - 2018 |
Externally published | Yes |
Event | 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 - Melbourne, Australia. Duration: 15 Jul 2018 → 20 Jul 2018 |
Publication series
Name | ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) |
---|---|
Volume | 1 |
Conference
Conference | 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 |
---|---|
Country/Territory | Australia |
City | Melbourne |
Period | 15/07/18 → 20/07/18 |
Bibliographical note
Funding Information: We thank Dallas Card, Elizabeth Clark, Peter Clark, Bhavana Dalvi, Jesse Dodge, Nicholas FitzGerald, Matt Gardner, Yoav Goldberg, Mark Hopkins, Vidur Joshi, Tushar Khot, Kelvin Luu, Mark Neumann, Hao Peng, Matthew E. Peters, Sasha Rush, Ashish Sabharwal, Minjoon Seo, Sofia Serrano, Swabha Swayamdipta, Chenhao Tan, Niket Tandon, Trang Tran, Mark Yatskar, Scott Yih, Vicki Zayats, Rowan Zellers, Luke Zettlemoyer, and several anonymous reviewers for their helpful advice and feedback. This work was supported in part by NSF grant IIS-1562364, by the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by NSF grant ACI-1548562, and by the NVIDIA Corporation through the donation of a Tesla GPU.
Publisher Copyright:
© 2018 Association for Computational Linguistics