Redundancy Is Not Necessarily Detrimental in Classification Problems
Authors: | Luis Salgueiro Salgueiro Romero, Laura Raquel Bareiro Paniagua, Francisco Gómez-Vela, Jacques Facon, Deysi Natalia Leguizamon Correa, Julio César Mello Román, Miguel García-Torres, Diego P. Pinto-Roa, Sebastián Alberto Grillo, José Luis Vázquez Noguera |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Computer science; General Mathematics; Dimensionality reduction; Feature selection; Measure (mathematics); Classification; Computer Science (miscellaneous); Redundancy (engineering); Feature (machine learning); QA1-939; Data mining; Feature construction; Engineering (miscellaneous); Selection (genetic algorithm); Linear separability; Mathematics |
Source: | Mathematics, Vol. 9, Issue 22, Article 2899 (2021) |
ISSN: | 2227-7390 |
DOI: | 10.3390/math9222899 |
Description: | In feature selection, redundancy is a major concern because removing redundancy from data is closely tied to dimensionality reduction. Despite this connection, few works study redundancy theoretically. In this work, we analyze the effect of redundant features on the performance of classification models. The contributions of this work are: (i) a theoretical framework for analyzing feature construction and selection, (ii) a demonstration that certain properly defined features are redundant yet make the data linearly separable, and (iii) a formal criterion for validating feature construction methods. Experimental results suggest that a large number of redundant features can reduce the classification error. These results imply that it is not enough to analyze features solely with criteria that measure the amount of information the features provide. |
Database: | OpenAIRE |
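
Contribution (ii) in the description can be illustrated with a small sketch. It is not taken from the paper: the XOR toy dataset, the product feature `x1 * x2`, and the helper `train_perceptron` are all assumptions chosen for illustration, and "redundant" is used here in the informal sense of a feature that is a deterministic function of the existing ones, which may differ from the paper's formal definition. A plain perceptron cannot fit XOR on the two raw features, but it can once the product feature is appended.

```python
# Illustrative sketch only: the dataset, feature, and helper are hypothetical.

def train_perceptron(rows, labels, epochs=100):
    """Train a plain perceptron; return (weights, bias, errors in last epoch)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(rows, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score > 0 else 0
            if pred != y:
                errors += 1
                step = y - pred  # +1 or -1
                w = [wi + step * xi for wi, xi in zip(w, x)]
                b += step
        if errors == 0:  # every point classified correctly: stop early
            break
    return w, b, errors


# XOR: not linearly separable in (x1, x2).
raw = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

# Append x1 * x2. It is fully determined by x1 and x2, so it carries
# no information beyond what the raw features already contain.
augmented = [(x1, x2, x1 * x2) for x1, x2 in raw]

_, _, raw_errors = train_perceptron(raw, labels)
_, _, aug_errors = train_perceptron(augmented, labels)

print("errors in final epoch, raw features:      ", raw_errors)
print("errors in final epoch, with x1*x2 added:  ", aug_errors)
```

On the augmented data the classes are separated by, for example, the hyperplane `x1 + x2 - 2*x3 = 0.5`, so the perceptron reaches an error-free epoch, while on the raw features no such epoch exists.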