Abstract
We present a novel word-level vector representation based on symmetric patterns (SPs). To this end, we automatically acquire SPs (e.g., “X and Y”) from a large corpus of plain text and generate vectors in which each coordinate represents the co-occurrence in SPs of the represented word with another word of the vocabulary. Our representation has three advantages over existing alternatives. First, being based on symmetric word relationships, it is highly suitable for word similarity prediction. In particular, on the SimLex999 word similarity dataset, our model achieves a Spearman’s ρ score of 0.517, compared to 0.462 for the state-of-the-art word2vec model. Interestingly, our model performs exceptionally well on verbs, outperforming state-of-the-art baselines by 20.2–41.5%. Second, pattern features can be adapted to the needs of a target NLP application. For example, we show that we can easily control whether the embeddings derived from SPs deem antonym pairs (e.g., (big, small)) as similar or dissimilar, an important distinction for tasks such as word classification and sentiment analysis. Finally, we show that a simple combination of the word similarity scores generated by our method and by word2vec yields higher predictive power than either model alone, scoring as high as 0.563 in Spearman’s ρ on SimLex999. This emphasizes the differences between the signals captured by each of the models.
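The vector construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single hand-written pattern ("X and Y") instead of automatically acquired SPs, raw counts instead of any weighting scheme, and a tiny made-up corpus. The names `CORPUS`, `SP_AND`, `sp_vectors`, and `cosine` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus (assumption); the paper uses a large plain-text corpus.
CORPUS = [
    "cats and dogs",
    "dogs and cats",
    "cats and mice",
    "dogs and mice",
    "big and small",
    "small and big",
]

# One hand-written symmetric pattern, "X and Y" (assumption);
# the paper acquires its patterns automatically.
SP_AND = re.compile(r"\b(\w+) and (\w+)\b")


def sp_vectors(sentences):
    """Map each word to a sparse vector of pattern co-occurrence counts."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        for x, y in SP_AND.findall(sentence.lower()):
            # Count the pair in both directions, since the pattern is symmetric.
            vectors[x][y] += 1
            vectors[y][x] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = sp_vectors(CORPUS)
print(cosine(vectors["cats"], vectors["dogs"]))  # both co-occur with "mice"
print(cosine(vectors["cats"], vectors["big"]))   # no shared pattern neighbours
```

In this sketch, "cats" and "dogs" receive a non-zero similarity only because both appear in the pattern with "mice", while "cats" and "big" share no pattern neighbours and score zero. A realistic setup would need many patterns, a large corpus, and some form of count weighting.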
Original language | English |
---|---|
Title of host publication | CoNLL 2015 - 19th Conference on Computational Natural Language Learning, Proceedings |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 258-267 |
Number of pages | 10 |
ISBN (Electronic) | 9781941643778 |
State | Published - 2015 |
Event | 19th Conference on Computational Natural Language Learning, CoNLL 2015, Beijing, China; Duration: 30 Jul 2015 → 31 Jul 2015 |
Publication series
Name | CoNLL 2015 - 19th Conference on Computational Natural Language Learning, Proceedings |
---|---|
Conference
Conference | 19th Conference on Computational Natural Language Learning, CoNLL 2015 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 30/07/15 → 31/07/15 |
Bibliographical note
Publisher Copyright: © 2015 Association for Computational Linguistics.