Investigating the effect of automatic MWE recognition on CCG parsing

Miryam de Lhoneux*, Omri Abend, Mark Steedman

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

3 Scopus citations

Abstract

We investigate the use of automatic multiword expression (MWE) recognition in parsing with Combinatory Categorial Grammar (CCG). We transform the representation of MWEs in CCGbank by collapsing each of them into a single token. Our model significantly outperforms the baseline on the transformed gold standard, showing the benefit of having this information at training time. It also performs significantly better on the transformed gold standard when the transformation is done before parsing rather than after parsing, which shows that the information can help the parser at prediction time. We conclude that, despite the limited setting (our transformation algorithm can only deal with MWEs that do not cross constituent boundaries), our method can lead to improvements. We obtain different results with MWE recognisers that detect different types of MWE, and we therefore emphasize the need to experiment with different recognisers to find out which ones this method is best suited to.

Original language: English
Title of host publication: Representation and Parsing of Multiword Expressions
Subtitle of host publication: Current Trends
Publisher: Language Science Press
Pages: 183-215
Number of pages: 33
ISBN (Electronic): 9783961101450
ISBN (Print): 9783961101467
State: Published - 4 Jul 2019

Bibliographical note

Publisher Copyright:
© 2019, the authors. All rights reserved.
