Abstract
State-of-the-art word embeddings, which are often trained on bag-of-words (BOW) contexts, provide a high-quality representation of aspects of the semantics of nouns. However, their quality decreases substantially for the task of verb similarity prediction. In this paper we show that using symmetric pattern contexts (SPs, e.g., "X and Y") improves word2vec verb similarity performance by up to 15% and is also instrumental in adjective similarity prediction. The unsupervised SP contexts are even superior to a variety of dependency contexts extracted using a supervised dependency parser. Moreover, we observe that SPs and dependency coordination contexts (Coor) capture a similar type of information, and demonstrate that Coor contexts are superior to other dependency contexts, including the set of all dependency contexts, although they are still inferior to SPs. Finally, there are substantially fewer SP contexts than in alternative representations, leading to a massive reduction in training time. On an 8G-word corpus and a 32-core machine, the SP model trains in 11 minutes, compared to 5 and 11 hours with BOW and all dependency contexts, respectively.
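To make the notion of a symmetric pattern context concrete, the following is a minimal sketch of how "X and Y" occurrences could be turned into (word, context) pairs. It is an illustration only, not the authors' pipeline: the abstract does not list the pattern set or the extraction procedure, so the single hard-coded connective, the crude tokenizer, and the names `SP_CONNECTIVES` and `sp_context_pairs` are all assumptions. Emitting each pair in both directions is likewise an assumption, based on the pattern being symmetric.

```python
import re
from collections import Counter

# Assumption: one hand-picked connective. The paper's actual SP set is not
# given in the abstract and is likely broader.
SP_CONNECTIVES = {"and"}


def sp_context_pairs(sentence):
    """Yield (word, context) pairs for each 'X and Y' occurrence.

    Both directions are emitted, so X is a context of Y and Y is a context of X.
    """
    # Crude tokenizer: lowercase alphabetic runs only, punctuation is dropped.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    for i in range(1, len(tokens) - 1):
        if tokens[i] in SP_CONNECTIVES:
            left, right = tokens[i - 1], tokens[i + 1]
            yield (left, right)
            yield (right, left)


# Toy corpus, invented for illustration.
corpus = [
    "She can run and jump.",
    "They walk and talk, then run and jump.",
]

# Count how often each (word, context) pair occurs across the corpus.
pair_counts = Counter(pair for s in corpus for pair in sp_context_pairs(s))
```

Only words that appear inside a pattern instance produce pairs, so a sketch like this yields far fewer contexts than a sliding BOW window over the same text, which is consistent with the reduction in training time reported above.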
Original language | English |
---|---|
Title of host publication | 2016 Conference of the North American Chapter of the Association for Computational Linguistics |
Subtitle of host publication | Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 499-505 |
Number of pages | 7 |
ISBN (Electronic) | 9781941643914 |
State | Published - 2016 |
Event | 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - San Diego, United States; Duration: 12 Jun 2016 → 17 Jun 2016 |
Publication series
Name | 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference |
---|
Conference
Conference | 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 |
---|---|
Country/Territory | United States |
City | San Diego |
Period | 12/06/16 → 17/06/16 |
Bibliographical note
Publisher Copyright: © 2016 Association for Computational Linguistics.