RENT—Repeated Elastic Net Technique for Feature Selection

Feature selection is an essential step in data science pipelines to reduce the complexity associated with large datasets. While much research on this topic focuses on optimizing predictive performance, few studies investigate stability in the context of the feature selection process. In this study, we present the Repeated Elastic Net Technique (RENT) for Feature Selection. RENT uses an ensemble of generalized linear models with elastic net regularization, each trained on distinct subsets of the training data. The feature selection is based on three criteria evaluating the weight distributions of features across all elementary models. This fact leads to the selection of features with high stability that improve the robustness of the final model. Furthermore, unlike established feature selectors, RENT provides valuable information for model interpretation concerning the identification of objects in the data that are difficult to predict during training. In our experiments, we benchmark RENT against six established feature selectors on eight multivariate datasets for binary classification and regression. In the experimental comparison, RENT shows a well-balanced trade-off between predictive performance and stability. Finally, we underline the additional interpretational value of RENT with an exploratory post-hoc analysis of a healthcare dataset.
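For readers who want a concrete picture of the general idea described in the abstract, the following minimal sketch (Python, using NumPy and scikit-learn) trains an ensemble of elastic-net logistic regression models on random subsets of the training data and keeps only features whose weights are stable across the ensemble. The three criteria shown here (non-zero frequency, sign consistency, and the ratio of mean weight to its spread), along with the function name, parameters, and cutoff values, are illustrative assumptions and are not the exact definitions or the reference implementation from the paper.

# Illustrative sketch of ensemble elastic-net feature selection in the spirit of RENT.
# All criteria and thresholds below are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def ensemble_elastic_net_selection(X, y, n_models=100, subsample=0.8,
                                   C=1.0, l1_ratio=0.5,
                                   freq_cut=0.9, sign_cut=0.9, snr_cut=1.0,
                                   seed=0):
    """Train an ensemble of elastic-net logistic regressions, each on a
    random subset of the training rows, and keep features whose weights are
    (1) frequently non-zero, (2) consistent in sign, and (3) large on
    average relative to their spread across the ensemble.
    X and y are NumPy arrays (y holds binary labels)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    weights = np.zeros((n_models, n_features))
    for m in range(n_models):
        # Distinct random subset of the training data for each elementary model.
        idx = rng.choice(n_samples, size=int(subsample * n_samples), replace=False)
        X_sub = StandardScaler().fit_transform(X[idx])
        model = LogisticRegression(penalty="elasticnet", solver="saga",
                                   C=C, l1_ratio=l1_ratio, max_iter=5000)
        model.fit(X_sub, y[idx])
        weights[m] = model.coef_.ravel()
    # Criterion 1: how often each feature receives a non-zero weight.
    nonzero_freq = (weights != 0).mean(axis=0)
    # Criterion 2: how consistent the sign of each weight is across models.
    sign_consistency = np.abs(np.sign(weights).mean(axis=0))
    # Criterion 3: mean absolute weight relative to its spread across models.
    mean_to_spread = np.abs(weights.mean(axis=0)) / (weights.std(axis=0) + 1e-12)
    selected = ((nonzero_freq >= freq_cut) &
                (sign_consistency >= sign_cut) &
                (mean_to_spread >= snr_cut))
    return np.flatnonzero(selected), weights

# Example usage (hypothetical data):
# selected, weights = ensemble_elastic_net_selection(X_train, y_train)
# X_train_reduced = X_train[:, selected]

In this sketch, raising freq_cut and sign_cut yields a smaller, more stable feature set at the possible expense of predictive performance, which mirrors the performance/stability trade-off discussed in the abstract.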


Saved in:
Bibliographic Details
Main Authors: Anna Jenul, Stefan Schrunner, Kristian Hovde Liland, Ulf Geir Indahl, Cecilia Marie Futsaether, Oliver Tomic
Format: article
Language: EN
Published: IEEE 2021
Subjects: Elastic net regularization; exploratory analysis; ensemble feature selection; generalized linear models; selection stability; Electrical engineering. Electronics. Nuclear engineering (TK1-9971)
Online Access: https://doaj.org/article/5b1acb3b57a54811ba6a16ee486dffd9
id oai:doaj.org-article:5b1acb3b57a54811ba6a16ee486dffd9
record_format dspace
spelling oai:doaj.org-article:5b1acb3b57a54811ba6a16ee486dffd9 2021-11-20T00:02:45Z
RENT—Repeated Elastic Net Technique for Feature Selection
ISSN 2169-3536
DOI 10.1109/ACCESS.2021.3126429
https://doaj.org/article/5b1acb3b57a54811ba6a16ee486dffd9
2021-01-01T00:00:00Z
https://ieeexplore.ieee.org/document/9606766/
https://doaj.org/toc/2169-3536
IEEE Access, Vol 9, Pp 152333-152346 (2021)
institution DOAJ
collection DOAJ
language EN
topic Elastic net regularization
exploratory analysis
ensemble feature selection
generalized linear models
selection stability
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
description Feature selection is an essential step in data science pipelines to reduce the complexity associated with large datasets. While much research on this topic focuses on optimizing predictive performance, few studies investigate stability in the context of the feature selection process. In this study, we present the Repeated Elastic Net Technique (RENT) for Feature Selection. RENT uses an ensemble of generalized linear models with elastic net regularization, each trained on distinct subsets of the training data. The feature selection is based on three criteria evaluating the weight distributions of features across all elementary models. This fact leads to the selection of features with high stability that improve the robustness of the final model. Furthermore, unlike established feature selectors, RENT provides valuable information for model interpretation concerning the identification of objects in the data that are difficult to predict during training. In our experiments, we benchmark RENT against six established feature selectors on eight multivariate datasets for binary classification and regression. In the experimental comparison, RENT shows a well-balanced trade-off between predictive performance and stability. Finally, we underline the additional interpretational value of RENT with an exploratory post-hoc analysis of a healthcare dataset.
format article
author Anna Jenul
Stefan Schrunner
Kristian Hovde Liland
Ulf Geir Indahl
Cecilia Marie Futsaether
Oliver Tomic
author_sort Anna Jenul
title RENT—Repeated Elastic Net Technique for Feature Selection
title_sort rent—repeated elastic net technique for feature selection
publisher IEEE
publishDate 2021
url https://doaj.org/article/5b1acb3b57a54811ba6a16ee486dffd9
work_keys_str_mv AT annajenul rentx2014repeatedelasticnettechniqueforfeatureselection
AT stefanschrunner rentx2014repeatedelasticnettechniqueforfeatureselection
AT kristianhovdeliland rentx2014repeatedelasticnettechniqueforfeatureselection
AT ulfgeirindahl rentx2014repeatedelasticnettechniqueforfeatureselection
AT ceciliamariefutsaether rentx2014repeatedelasticnettechniqueforfeatureselection
AT olivertomic rentx2014repeatedelasticnettechniqueforfeatureselection
_version_ 1718419834030522368