TY - JOUR
T1 - Analysis of sentence embedding models using prediction tasks in natural language processing
AU - Adi, Yossi
AU - Kermany, Einat
AU - Belinkov, Yonatan
AU - Lavi, Ofer
AU - Goldberg, Yoav
N1 - Publisher Copyright:
© 2017 IBM.
PY - 2017/7/1
Y1 - 2017/7/1
AB - The tremendous success of word embeddings in improving the ability of computers to perform natural language tasks has shifted research on language representation from word representations to sentence representations. This shift has introduced a plethora of methods for learning vector representations of sentences, many of them based on compositional methods over word embeddings. These vectors are used as features for subsequent machine learning tasks or for pretraining in the context of deep learning. However, little is known about the properties encoded in these sentence representations or the linguistic information they encapsulate. Recent studies analyze the encoded representations and the kind of information they capture. In this paper, we analyze results from a previous study on the ability of models to encode basic properties such as content, order, and length. Our analysis led to new insights, such as the effects of word frequency and word distance on the ability to encode content and order.
UR - http://www.scopus.com/inward/record.url?scp=85029468769&partnerID=8YFLogxK
DO - 10.1147/JRD.2017.2702858
M3 - Article
AN - SCOPUS:85029468769
SN - 0018-8646
VL - 61
JO - IBM Journal of Research and Development
JF - IBM Journal of Research and Development
IS - 4
M1 - 8030297
ER -