TY - GEN
T1 - Translation and extension of concepts across languages
AU - Davidov, Dmitry
AU - Rappoport, Ari
PY - 2009
Y1 - 2009
AB - We present a method that, given a few words defining a concept in one language, retrieves, disambiguates, and extends corresponding terms that define a similar concept in another specified language. This is useful for cross-lingual information retrieval and for preparing multilingual lexical resources. We automatically obtain term translations from multilingual dictionaries and disambiguate them using web counts. We then retrieve web snippets in which the translations co-occur and discover additional concept terms from these snippets. Term discovery is based on the co-appearance of similar words in symmetric patterns. We evaluate the method on a set of language pairs involving 45 languages, including combinations of very dissimilar languages such as Russian, Chinese, and Hebrew, for various concepts. We assess the quality of the retrieved sets both with human judgments and by automatically comparing the obtained categories to the corresponding English WordNet synsets.
UR - http://www.scopus.com/inward/record.url?scp=84555216904&partnerID=8YFLogxK
U2 - 10.3115/1609067.1609086
DO - 10.3115/1609067.1609086
M3 - Conference contribution
AN - SCOPUS:84555216904
SN - 9781932432169
T3 - EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
SP - 175
EP - 183
BT - EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2009
Y2 - 30 March 2009 through 3 April 2009
ER -