Redundancy Is Not Necessarily Detrimental in Classification Problems

In feature selection, redundancy is one of the major concerns since the removal of redundancy in data is connected with dimensionality reduction. Despite the evidence of such a connection, few works present theoretical studies regarding redundancy. In this work, we analyze the effect of redundant fe...

Bibliographic Details
Main authors: Sebastián Alberto Grillo, José Luis Vázquez Noguera, Julio César Mello Román, Miguel García-Torres, Jacques Facon, Diego P. Pinto-Roa, Luis Salgueiro Romero, Francisco Gómez-Vela, Laura Raquel Bareiro Paniagua, Deysi Natalia Leguizamon Correa
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: feature selection, feature construction, classification, Mathematics, QA1-939
Online access: https://doaj.org/article/59081a7873574c42abfaf89268578daa
id oai:doaj.org-article:59081a7873574c42abfaf89268578daa
record_format dspace
spelling oai:doaj.org-article:59081a7873574c42abfaf89268578daa
2021-11-25T18:17:01Z
Redundancy Is Not Necessarily Detrimental in Classification Problems
10.3390/math9222899
2227-7390
https://doaj.org/article/59081a7873574c42abfaf89268578daa
2021-11-01T00:00:00Z
https://www.mdpi.com/2227-7390/9/22/2899
https://doaj.org/toc/2227-7390
In feature selection, redundancy is one of the major concerns since the removal of redundancy in data is connected with dimensionality reduction. Despite the evidence of such a connection, few works present theoretical studies regarding redundancy. In this work, we analyze the effect of redundant features on the performance of classification models. We can summarize the contribution of this work as follows: (i) develop a theoretical framework to analyze feature construction and selection, (ii) show that certain properly defined features are redundant but make the data linearly separable, and (iii) propose a formal criterion to validate feature construction methods. The results of experiments suggest that a large number of redundant features can reduce the classification error. The results imply that it is not enough to analyze features solely using criteria that measure the amount of information provided by such features.
Sebastián Alberto Grillo
José Luis Vázquez Noguera
Julio César Mello Román
Miguel García-Torres
Jacques Facon
Diego P. Pinto-Roa
Luis Salgueiro Romero
Francisco Gómez-Vela
Laura Raquel Bareiro Paniagua
Deysi Natalia Leguizamon Correa
MDPI AG
article
feature selection
feature construction
classification
Mathematics
QA1-939
EN
Mathematics, Vol 9, Iss 2899, p 2899 (2021)
institution DOAJ
collection DOAJ
language EN
topic feature selection
feature construction
classification
Mathematics
QA1-939
description In feature selection, redundancy is one of the major concerns since the removal of redundancy in data is connected with dimensionality reduction. Despite the evidence of such a connection, few works present theoretical studies regarding redundancy. In this work, we analyze the effect of redundant features on the performance of classification models. We can summarize the contribution of this work as follows: (i) develop a theoretical framework to analyze feature construction and selection, (ii) show that certain properly defined features are redundant but make the data linearly separable, and (iii) propose a formal criterion to validate feature construction methods. The results of experiments suggest that a large number of redundant features can reduce the classification error. The results imply that it is not enough to analyze features solely using criteria that measure the amount of information provided by such features.
format article
author Sebastián Alberto Grillo
José Luis Vázquez Noguera
Julio César Mello Román
Miguel García-Torres
Jacques Facon
Diego P. Pinto-Roa
Luis Salgueiro Romero
Francisco Gómez-Vela
Laura Raquel Bareiro Paniagua
Deysi Natalia Leguizamon Correa
author_sort Sebastián Alberto Grillo
title Redundancy Is Not Necessarily Detrimental in Classification Problems
title_sort redundancy is not necessarily detrimental in classification problems
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/59081a7873574c42abfaf89268578daa
_version_ 1718411382107406336