Enhancement of lexical concepts using cross-lingual web mining

Dmitry Davidov*, Ari Rappoport

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

6 Scopus citations

Abstract

Sets of lexical items sharing a significant aspect of their meaning (concepts) are fundamental in linguistics and NLP. Manual concept compilation is labor-intensive, error-prone and subjective. We present a web-based concept extension algorithm. Given a set of terms specifying a concept in some language, we translate them to a wide range of intermediate languages, disambiguate the translations using web counts, and discover additional concept terms using symmetric patterns. We then translate the discovered terms back into the original language, score them, and extend the original concept by adding back-translations having high scores. We evaluate our method in 3 source languages and 45 intermediate languages, using both human judgments and WordNet. In all cases, our cross-lingual algorithm significantly improves high-quality concept extension.

Original language: English
Pages: 852-861
Number of pages: 10
State: Published - 2009
Event: 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009 - Singapore, Singapore
Duration: 6 Aug 2009 to 7 Aug 2009

Conference

Conference: 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009
Country/Territory: Singapore
City: Singapore
Period: 6/08/09 to 7/08/09
