Investigating the effect of automatic MWE recognition on CCG parsing

Miryam de Lhoneux, Omri Abend, Mark Steedman

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

The objective of this work is to find out whether information about Multiword Expressions (MWEs) can improve parsing with Combinatory Categorial Grammar (CCG). Inspired by studies that have shown the benefit of using information about MWEs for parsing, we transform the representation of some MWEs in CCGbank by collapsing them to one token. In contrast with those studies, we use information about MWEs obtained automatically, in order to find out whether automatic MWE recognition can be used to help parsing. We look at two different effects that such an approach can have. First, training on the transformed data can help improve parsing accuracy; we call this a training effect. Second, transformed data can help the parser in its decisions; we call this a parsing effect. Our model significantly outperforms the baseline model on the transformed gold standard, which indicates that there is a training effect. Our model also performs significantly better on the transformed gold standard when the transformation is done before parsing rather than after parsing, which indicates that there is a parsing effect. We show that these results can lead to improved performance on the non-transformed standard benchmark, although we fail to show that the improvement is significant. We conclude that, despite the limited setting (our transformation algorithm can only deal with MWEs that do not cross constituent boundaries), there are noticeable improvements from using MWEs in parsing. We discuss ways in which the incorporation of MWEs into parsing can be improved and hypothesise that this will lead to more substantial results. We obtain different results with recognisers that detect different types of MWE and therefore emphasise the need to experiment with different recognisers, in order to find out which types of MWE this method is best suited to.
Original language: English
Title of host publication: Representation and Parsing of Multiword Expressions
Subtitle of host publication: Current trends
Editors: Yannick Parmentier, Jakub Waszczuk
Place of Publication: Berlin, Germany
Publisher: Language Science Press
Chapter: 7
Pages: 183-215
Number of pages: 33
ISBN (Electronic): 978-3-96110-145-0
ISBN (Print): 978-3-96110-146-7
State: Published - 1 Jun 2019

Publication series

Name: Phraseology and Multiword Expressions
Publisher: Language Science Press
Volume: 3
ISSN (Print): 2625-3127

Keywords

  • Multiword Expressions
  • MWEs
  • Combinatory Categorial Grammar
  • CCG
  • Deep parsing
  • linguistic theories and applications
