Abstract
Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches. Recently, connections have been shown between convolutional neural networks (CNNs) and weighted finite state automata (WFSAs), leading to new interpretations and insights. In this work, we show that some recurrent neural networks also share this connection to WFSAs. We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs. We show that several recent neural models use rational recurrences. Our analysis provides a fresh view of these models and facilitates devising new neural architectures that draw inspiration from WFSAs. We present one such model, which performs better than two recent baselines on language modeling and text classification. Our results demonstrate that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.
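To make the definition of a rational recurrence concrete, the sketch below computes the Forward score of a small WFSA and checks it against a gated hidden state update. The two-state automaton, the gate values, and all function names are illustrative assumptions and are not taken from the paper; the real (sum-product) semiring is assumed.

```python
def forward_score(weights, initial, final):
    """Forward algorithm over the real (sum-product) semiring.

    weights[t][i][j] is the weight of moving from state i to state j
    while reading the t-th input; initial and final hold per-state
    start and end weights.
    """
    alpha = list(initial)
    n = len(alpha)
    for W in weights:
        # Sum over all predecessor states i for each target state j.
        alpha = [sum(alpha[i] * W[i][j] for i in range(n)) for j in range(n)]
    return sum(a * f for a, f in zip(alpha, final))


def two_state_transitions(forget, update):
    # Hypothetical WFSA: state 0 self-loops with weight 1, the edge
    # 0 -> 1 has weight `update`, and state 1 self-loops with weight `forget`.
    return [[1.0, update], [0.0, forget]]


def gated_recurrence(forgets, updates):
    # Hidden state update c_t = f_t * c_{t-1} + u_t, starting from c_0 = 0.
    c = 0.0
    for f, u in zip(forgets, updates):
        c = f * c + u
    return c


# Made-up gate values standing in for input-dependent weights.
forgets = [0.5, 0.25, 0.5]
updates = [1.0, 2.0, 4.0]
weights = [two_state_transitions(f, u) for f, u in zip(forgets, updates)]
score = forward_score(weights, initial=[1.0, 0.0], final=[0.0, 1.0])
recurrent_state = gated_recurrence(forgets, updates)
```

Under these assumptions, `score` and `recurrent_state` agree at every sequence length, because the Forward weight of the final state obeys the same update as the recurrence: each step either extends a path already in state 1 (scaled by the forget weight) or enters state 1 from state 0 (adding the update weight).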
Original language | English |
---|---|
Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 |
Editors | Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii |
Publisher | Association for Computational Linguistics |
Pages | 1203-1214 |
Number of pages | 12 |
ISBN (Electronic) | 9781948087841 |
State | Published - 2018 |
Externally published | Yes |
Event | 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium; Duration: 31 Oct 2018 → 4 Nov 2018 |
Conference
Conference | 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 |
---|---|
Country/Territory | Belgium |
City | Brussels |
Period | 31/10/18 → 4/11/18 |
Bibliographical note
Publisher Copyright: © 2018 Association for Computational Linguistics