Conference papers

Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut?

Caroline Pasquer 1 Agata Savary 2 Carlos Ramisch 3 Jean-Yves Antoine
2 BDTLN - Bases de données et traitement des langues naturelles
LIFAT - Laboratoire d'Informatique Fondamentale et Appliquée de Tours
3 TALEP - Traitement Automatique du Langage Ecrit et Parlé
LIS - Laboratoire d'Informatique et Systèmes
Abstract : Automatic identification of multiword expressions (MWEs), such as to cut corners 'to do an incomplete job', is a prerequisite for semantically-oriented downstream applications. This task is challenging because MWEs, especially verbal ones (VMWEs), exhibit surface variability. This paper deals with a subproblem of VMWE identification: the identification of occurrences of previously seen VMWEs. A simple language-independent system based on a combination of filters competes with the best systems from a recent shared task: it obtains the best averaged F-score over 11 languages (0.6653) for seen VMWEs and, because seen VMWEs make up a high proportion of the VMWEs in texts, even the best score for seen and unseen VMWEs taken together. This suggests that focusing on the identification of seen VMWEs could be a strategy to improve VMWE identification in general.
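To make the idea of identifying previously seen VMWEs concrete, here is a minimal Python sketch. It is not the authors' system: the abstract does not describe the paper's filters, so the single filter used here (a maximum gap between components, `max_gap`) is an assumption made for illustration, as are the function names `extract_seen` and `identify_seen` and the input representation (a sentence as a list of lemmas, annotated VMWEs as lists of token indices).

```python
from itertools import product
from typing import FrozenSet, List, Set, Tuple

# Hypothetical input format: (lemmas of a sentence, list of VMWEs as token indices).
AnnotatedSentence = Tuple[List[str], List[List[int]]]


def extract_seen(train: List[AnnotatedSentence]) -> Set[FrozenSet[str]]:
    """Collect the lemma set of every VMWE annotated in the training data."""
    seen: Set[FrozenSet[str]] = set()
    for lemmas, vmwes in train:
        for indices in vmwes:
            seen.add(frozenset(lemmas[i] for i in indices))
    return seen


def identify_seen(lemmas: List[str],
                  seen: Set[FrozenSet[str]],
                  max_gap: int = 3) -> List[Tuple[int, ...]]:
    """Return index tuples whose lemmas match a seen VMWE, in any order.

    Illustrative filter (an assumption, not taken from the paper):
    neighbouring components may be separated by at most `max_gap` tokens.
    """
    found = set()
    for entry in seen:
        # Candidate positions for each component lemma of the seen VMWE.
        positions = [[i for i, lem in enumerate(lemmas) if lem == target]
                     for target in sorted(entry)]
        if any(not p for p in positions):
            continue  # at least one component lemma is absent
        for combo in product(*positions):
            idx = tuple(sorted(combo))
            if all(b - a - 1 <= max_gap for a, b in zip(idx, idx[1:])):
                found.add(idx)
    return sorted(found)


# Toy data built around the abstract's example "to cut corners".
train = [(["he", "cut", "corner"], [[1, 2]])]
seen = extract_seen(train)
matches = identify_seen(["the", "corner", "be", "cut", "again"], seen)
```

Matching on lemma sets rather than surface strings is what lets the sketch tolerate the surface variability mentioned in the abstract: the passive, reordered "the corners were cut again" still matches the entry learned from "he cut corners".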
Contributor : Agata Savary
Submitted on : Thursday, November 19, 2020 - 9:26:15 AM
Last modification on : Monday, December 14, 2020 - 5:38:41 PM
Long-term archiving on : Saturday, February 20, 2021 - 6:27:29 PM


Publisher files allowed on an open archive


  • HAL Id : hal-03013636, version 1


Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine. Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut? The 28th International Conference on Computational Linguistics (COLING-20), Dec 2020, Barcelona, Spain. ⟨hal-03013636⟩


