Author: |
Clouvel, Laura; Iooss, Bertrand; Chabridon, Vincent; Il Idrissi, Marouane; Robin, Frédérique |
Contributors: |
PERformance et prévention des Risques Industriels du parC par la simuLation et les EtudeS (EDF R&D PERICLES), EDF R&D (EDF R&D), EDF (EDF)-EDF (EDF), Performance, Risque Industriel, Surveillance pour la Maintenance et l’Exploitation (EDF R&D PRISME), Méthodes d'Analyse Stochastique des Codes et Traitements Numériques (GdR MASCOT-NUM), Centre National de la Recherche Scientifique (CNRS), Institut de Mathématiques de Toulouse UMR5219 (IMT), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS) |
Language: |
English |
Publication year: |
2023 |
Subject: |
|
Description: |
In the context of regression analysis, importance measures are effective tools for performing feature selection or for interpreting a model by ranking the most influential regressors. In particular, variance-based importance measures (VIMs) are prominent in statistics, and also in the more recent field of global sensitivity analysis, owing to their accessible interpretation as shares of the variance of the explained variable. Focusing on the linear regression model, this work revisits the most well-founded methods (some of them rather old and sometimes misunderstood) and clarifies their respective positioning, conditions of use, intrinsic capabilities, and interpretation. Some challenges are discussed, such as dependent inputs and high input dimension. A particular set of VIMs, called the Johnson indices, has shown the potential to address both of these challenges. Their equivalence with other known VIMs, the Lindeman-Merenda-Gold (LMG) indices, is proved in the case of two inputs. The practical relevance of these tools is highlighted through an empirical study on simulated data as well as public datasets. An application to classification tasks (logistic regression) is also presented in the supplementary material. |
Database: |
OpenAIRE |
External link: |
|