Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut?

Caroline Pasquer; Agata Savary; Carlos Ramisch; Jean-Yves Antoine

Conference paper, Year: 2020

Caroline Pasquer

Role: Author

Université de Tours

Agata Savary

Role: Author
PersonId : 4644
IdHAL : agata-savary
IdRef : 113077661

Bases de données et traitement des langues naturelles

Carlos Ramisch

Role: Author
PersonId : 5103
IdHAL : carlos-ramisch
ORCID : 0000-0001-7466-9039
IdRef : 170720802

Traitement Automatique du Langage Ecrit et Parlé

Jean-Yves Antoine

Role: Author
PersonId : 4673
IdHAL : jean-yves-antoine
IdRef : 137158319

Abstract

Automatic identification of multiword expressions (MWEs), like to cut corners 'to do an incomplete job', is a prerequisite for semantically-oriented downstream applications. This task is challenging because MWEs, especially verbal ones (VMWEs), exhibit surface variability. This paper deals with a subproblem of VMWE identification: the identification of occurrences of previously seen VMWEs. A simple language-independent system based on a combination of filters competes with the best systems from a recent shared task: it obtains the best averaged F-score over 11 languages (0.6653), and even the best score for seen and unseen VMWEs taken together, due to the high proportion of seen VMWEs in texts. This suggests that focusing on the identification of seen VMWEs could be a strategy to improve VMWE identification in general.

Domains

Computation and Language [cs.CL]

Main file

paper.pdf (845.41 KB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-03013636

Submitted on: Thursday, November 19, 2020, 09:26:15

Last modified on: Friday, March 22, 2024, 18:24:04

Long-term archiving on: Saturday, February 20, 2021, 18:27:29

Dates and versions

hal-03013636, version 1 (19-11-2020)

Identifiers

HAL Id: hal-03013636, version 1

Cite

Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine. Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut? The 28th International Conference on Computational Linguistics (COLING-20), Dec 2020, Barcelona, Spain. ⟨hal-03013636⟩

Collections

UNIV-TLN UNIV-TOURS CNRS UNIV-AMU LIBDTLN LIS-LAB LIFAT INSA-GROUPE INSA-CVL ANR INCIAM

148 views

177 downloads
