Abstract
We investigate the use of automatic recognition of multiword expressions (MWEs) in parsing with Combinatory Categorial Grammar (CCG). We transform the representation of MWEs in CCGbank by collapsing each MWE into a single token. Our model significantly outperforms the baseline on the transformed gold standard, which shows the benefit of having this information at training time. It also performs significantly better on the transformed gold standard when the transformation is done before parsing rather than after parsing, which shows that MWE information can help the parser at prediction time. We conclude that, despite the limited setting (our transformation algorithm can only handle MWEs that do not cross constituent boundaries), our method can lead to improvements. We obtain different results with MWE recognisers that detect different types of MWE, and we therefore emphasise the need to experiment with different recognisers to find out which ones the method is best suited to.
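The collapsing step and its constituent-boundary restriction can be illustrated with a minimal sketch. It is not the authors' algorithm: the nested-tuple tree encoding, the plain phrase labels (used in place of CCG categories), the underscore-joined token, and the helper names `leaves` and `collapse_mwes` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a word (str); an internal node is
# (label, [children]). Labels stand in for CCG categories.
Tree = Union[str, Tuple[str, list]]


def leaves(tree: Tree) -> List[str]:
    """Return the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    words: List[str] = []
    for child in children:
        words.extend(leaves(child))
    return words


def collapse_mwes(tree: Tree, mwes: Set[Tuple[str, ...]]) -> Tree:
    """Collapse every constituent whose words exactly match a recognised MWE.

    An MWE that does not coincide with a constituent (i.e. one that crosses
    constituent boundaries) never matches a subtree and is left untouched.
    """
    if isinstance(tree, str):
        return tree
    label, children = tree
    words = tuple(leaves(tree))
    if len(words) > 1 and words in mwes:
        # Replace the whole subtree by a single token under the same label.
        return (label, ["_".join(words)])
    return (label, [collapse_mwes(child, mwes) for child in children])


# Toy example sentence: "He lives in New York"
tree: Tree = (
    "S",
    [
        ("NP", ["He"]),
        ("VP", [
            ("V", ["lives"]),
            ("PP", [
                ("P", ["in"]),
                ("NP", [("N", ["New"]), ("N", ["York"])]),
            ]),
        ]),
    ],
)

collapsed = collapse_mwes(tree, {("New", "York")})
unchanged = collapse_mwes(tree, {("in", "New")})  # crosses a boundary
```

In this toy example, `("New", "York")` coincides with the inner `NP` and is merged into one token, whereas `("in", "New")` spans part of two constituents and therefore leaves the tree as it was.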
Original language | English |
---|---|
Title of host publication | Representation and Parsing of Multiword Expressions |
Subtitle of host publication | Current Trends |
Publisher | Language Science Press |
Pages | 183-215 |
Number of pages | 33 |
ISBN (Electronic) | 9783961101450 |
ISBN (Print) | 9783961101467 |
State | Published - 4 Jul 2019 |
Bibliographical note
Publisher Copyright: © 2019, the authors. All rights reserved.