Conference paper, Year: 2023

Continual self-supervised domain adaptation for end-to-end speaker diarization

Abstract

Conventional domain adaptation for speaker diarization requires a large collection of annotated conversations from the target domain. In this work, we propose a novel continual training scheme for domain adaptation of an end-to-end speaker diarization system, which processes one conversation at a time and is fully self-supervised thanks to pseudo-labels. These properties allow for autonomous adaptation (e.g., of a voice assistant to a new household) while avoiding permanent storage of possibly sensitive user conversations. We experiment extensively on the 11 domains of the DIHARD III corpus and show the effectiveness of our approach with respect to a pre-trained baseline, achieving a 17% relative performance improvement. We also find that data augmentation and a well-defined target domain are key factors in avoiding divergence and benefiting from transfer.
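The one-conversation-at-a-time, pseudo-label-driven scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the model, loss, augmentation, or hyperparameters, so everything below is an assumption made for illustration. `ToyFrameClassifier` is a hypothetical stand-in for an end-to-end diarization network (a single logistic unit over scalar frame features), `augment` is a hypothetical additive-noise augmentation, and the step count and learning rate are arbitrary.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyFrameClassifier:
    """Hypothetical stand-in for a diarization model: one logistic unit per frame."""

    def __init__(self, weight=1.0, bias=0.0):
        self.weight = weight
        self.bias = bias

    def predict_proba(self, frames):
        return [sigmoid(self.weight * x + self.bias) for x in frames]

    def sgd_step(self, frames, labels, lr):
        # One gradient step on the mean binary cross-entropy loss.
        probs = self.predict_proba(frames)
        n = len(frames)
        grad_w = sum((p - y) * x for p, y, x in zip(probs, labels, frames)) / n
        grad_b = sum(p - y for p, y in zip(probs, labels)) / n
        self.weight -= lr * grad_w
        self.bias -= lr * grad_b


def augment(frames, rng, noise_std=0.1):
    # Assumed augmentation: additive Gaussian noise on each frame feature.
    return [x + rng.gauss(0.0, noise_std) for x in frames]


def adapt_on_conversation(model, frames, rng, steps=5, lr=0.1):
    # 1. Pseudo-labels: threshold the current model's own predictions
    #    on the clean (non-augmented) conversation.
    pseudo = [1.0 if p >= 0.5 else 0.0 for p in model.predict_proba(frames)]
    # 2. Fine-tune on freshly augmented copies, using the pseudo-labels as targets.
    for _ in range(steps):
        model.sgd_step(augment(frames, rng), pseudo, lr)
    return pseudo


def continual_adaptation(model, conversation_stream, seed=0):
    # Consume conversations one at a time; none is kept after its update.
    rng = random.Random(seed)
    processed = 0
    for frames in conversation_stream:
        adapt_on_conversation(model, frames, rng)
        processed += 1
    return processed
```

Passing a generator as `conversation_stream` mirrors the privacy argument in the abstract: each conversation is used for one round of updates and is never retained. The sketch does not include any safeguard against divergence, which the abstract identifies as a real risk when augmentation or a well-defined target domain is missing.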
Main file
csda.pdf (585.43 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03824546, version 1 (21-10-2022)

Identifiers

  • HAL Id: hal-03824546, version 1

Cite

Juan Manuel Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset. Continual self-supervised domain adaptation for end-to-end speaker diarization. IEEE Spoken Language Technology Workshop (SLT 2022), IEEE Speech and Language Processing Technical Committee, Jan 2023, Doha, Qatar. To appear. ⟨hal-03824546⟩
