Institut de Mathématiques et de Modélisation de Montpellier
Preprint, working paper. Year: 2022

Response mixture models based on supervised components: clustering floristic taxa

Abstract

In this paper, we propose clustering outcomes in order to identify groups predicted by specific explanatory components. A response matrix Y is assumed to depend on a set X of explanatory variables and a set A of additional covariates. The variables in X are assumed to be numerous and redundant, which calls for dimension reduction and regularization. By contrast, A contains a few selected variables that are forced into the regression model, as they require no regularization. The matrix Y is assumed to be partitioned into G unknown groups of responses. We suppose that the outcomes in each group are predictable from an appropriate number of specific orthogonal supervised components of X. The classification is based on a mixture model of the responses. To estimate the model, we propose a criterion extending that of Supervised Component-based Generalized Linear Regression (SCGLR), a PLS-type method, and develop an algorithm combining those of SCGLR and EM estimation: response mixture SCGLR (rmSCGLR). This new methodology is tested on simulated data and then applied to a floristic ecology dataset.
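The idea of alternating between component estimation and response classification can be illustrated with a heavily simplified sketch. It is not the rmSCGLR algorithm of the paper: it assumes Gaussian responses, a single component per group, and replaces the SCGLR criterion with a plain PLS-type one (the leading eigenvector of a responsibility-weighted cross-product matrix). The function name `cluster_responses`, its parameters, the round-robin starting partition, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def cluster_responses(X, A, Y, G, init_labels=None, n_iter=20):
    """EM-style toy: cluster the columns of Y into G groups, each group
    being explained by one component f_g = X u_g plus the covariates A.
    Simplified Gaussian stand-in, not the rmSCGLR estimator itself."""
    n, q = Y.shape
    Xc = X - X.mean(axis=0)
    B = np.column_stack([np.ones(n), A])            # intercept + forced covariates
    # Remove the effect of A from Y before measuring covariance with X.
    Yr = Y - B @ np.linalg.lstsq(B, Y, rcond=None)[0]
    C = Xc.T @ Yr                                   # p x q cross-products
    # Assumed starting partition: round-robin unless labels are supplied.
    labels = np.arange(q) % G if init_labels is None else np.asarray(init_labels)
    R = np.eye(G)[labels]                           # q x G responsibilities
    for _ in range(n_iter):
        pi = np.clip(R.mean(axis=0), 1e-12, None)   # mixing proportions
        loglik = np.empty((q, G))
        U = []
        for g in range(G):
            # PLS-type step: direction maximising the weighted sum of
            # squared covariances with the responses of group g.
            M = (C * R[:, g]) @ C.T
            u = np.linalg.eigh(M)[1][:, -1]         # leading eigenvector
            U.append(u)
            # Regress every response on [1, A, f_g] and score the fit.
            Z = np.column_stack([B, Xc @ u])
            resid = Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]
            sigma2 = (resid ** 2).mean(axis=0)
            loglik[:, g] = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
        # E-step: posterior group probabilities of each response.
        logpost = np.log(pi) + loglik
        logpost -= logpost.max(axis=1, keepdims=True)
        R = np.exp(logpost)
        R /= R.sum(axis=1, keepdims=True)
    return R, np.column_stack(U)

# Simulated example: two hidden drivers, each behind 5 columns of X
# and 3 columns of Y; A is one extra covariate acting on all responses.
rng = np.random.default_rng(0)
n = 200
h = rng.normal(size=(n, 2))
X = np.repeat(h, 5, axis=1) + 0.3 * rng.normal(size=(n, 10))
A = rng.normal(size=(n, 1))
Y = 2.0 * np.repeat(h, 3, axis=1) + 0.5 * A + 0.3 * rng.normal(size=(n, 6))

R, U = cluster_responses(X, A, Y, G=2)
labels = R.argmax(axis=1)
print(labels)
```

In this toy setting the responses driven by the same hidden variable should end up in the same group. A faithful implementation would additionally handle non-Gaussian responses through the GLM machinery of SCGLR, several orthogonal components per group, and the regularized criterion described in the paper.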
Main file
preprint.pdf (604.6 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03547177, version 1 (28-01-2022)
hal-03547177, version 2 (25-05-2022)
hal-03547177, version 3 (05-09-2022)

Identifiers

  • HAL Id: hal-03547177, version 1

Cite

Julien Gibaud, Xavier Bry, Catherine Trottier, Frédéric Mortier, Maxime Réjou-Méchain. Response mixture models based on supervised components: clustering floristic taxa. 2022. ⟨hal-03547177v1⟩