Feature Selection Methods Based on Symmetric Uncertainty Coefficients and Independent Classification Information
Feature selection is a critical step in the data preprocessing phase in the field of pattern recognition and machine learning. The core of feature selection is to analyze and quantify the relevance, irrelevance, and redundancy between features and class labels. While existing feature selection methods give multiple explanations for these relationships, they ignore the multi-value bias of class-independent features and the redundancy of class-dependent features. Therefore, a feature selection method (Maximal Independent Classification Information and Minimal Redundancy, MICIMR) is proposed in this paper. Firstly, the relevance and redundancy terms of class-independent features are calculated based on the symmetric uncertainty coefficient. Secondly, the relevance and redundancy terms of class-dependent features are calculated according to the independent classification information criterion. Finally, the selection criteria for these two kinds of features are combined. To verify the effectiveness of the MICIMR algorithm, five feature selection methods are compared with it on fifteen real datasets. The experimental results demonstrate that the MICIMR algorithm outperforms the other feature selection algorithms in terms of redundancy rate as well as classification accuracy (Gmean_macro and F1_macro).
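The symmetric uncertainty coefficient the abstract relies on is a normalized mutual information, SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), which maps to [0, 1] and counteracts the multi-value bias of raw mutual information. The sketch below shows the SU building block and a generic greedy relevance-minus-redundancy selector; it is an illustration of the idea, not the paper's MICIMR scoring (the selector here is an assumed mRMR-style criterion, and all function names are our own):

```python
# Sketch: symmetric uncertainty (SU) for discrete features, plus a
# generic greedy relevance-minus-redundancy selector. The exact MICIMR
# criterion from the paper is NOT reproduced; this only illustrates
# the SU quantity it builds on.
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy H(X) of a discrete sequence, in bits."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def symmetric_uncertainty(xs, ys):
    """SU(X,Y) = 2*I(X;Y) / (H(X) + H(Y)), or 0 when both entropies
    vanish. The normalization bounds SU in [0, 1] and reduces the bias
    of raw MI toward many-valued features."""
    denom = entropy(xs) + entropy(ys)
    return 2.0 * mutual_information(xs, ys) / denom if denom > 0 else 0.0

def greedy_su_selection(features, labels, k):
    """Pick k features maximizing SU-relevance to the labels minus mean
    SU-redundancy to already-selected features (an mRMR-style rule,
    used here only as an illustration of combining the two terms)."""
    selected = []
    remaining = list(features.keys())
    while remaining and len(selected) < k:
        def score(name):
            rel = symmetric_uncertainty(features[name], labels)
            red = (sum(symmetric_uncertainty(features[name], features[s])
                       for s in selected) / len(selected)) if selected else 0.0
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A feature identical to the labels gets SU = 1, while a feature statistically independent of them gets SU = 0, so the greedy selector picks the informative feature first.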
Saved in:
Main Authors: | Li Zhang; Xiaobo Chen |
Format: | article |
Language: | EN |
Published: | IEEE, 2021 |
Subjects: | Feature selection; symmetric uncertainty coefficient; independent classification information; relevance; redundancy |
Online Access: | https://doaj.org/article/1897ab04eefd4547a0398a1341baba7e |
id |
oai:doaj.org-article:1897ab04eefd4547a0398a1341baba7e |
record_format |
dspace |
spelling |
oai:doaj.org-article:1897ab04eefd4547a0398a1341baba7e 2021-11-19T00:05:56Z
Feature Selection Methods Based on Symmetric Uncertainty Coefficients and Independent Classification Information
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3049815
2021-01-01T00:00:00Z
https://doaj.org/article/1897ab04eefd4547a0398a1341baba7e
https://ieeexplore.ieee.org/document/9316683/
https://doaj.org/toc/2169-3536
Li Zhang; Xiaobo Chen
IEEE
article
Feature selection; symmetric uncertainty coefficient; independent classification information; relevance; redundancy
Electrical engineering. Electronics. Nuclear engineering (TK1-9971)
EN
IEEE Access, Vol 9, Pp 13845-13856 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Feature selection; symmetric uncertainty coefficient; independent classification information; relevance; redundancy; Electrical engineering. Electronics. Nuclear engineering (TK1-9971) |
description |
Feature selection is a critical step in the data preprocessing phase in the field of pattern recognition and machine learning. The core of feature selection is to analyze and quantify the relevance, irrelevance, and redundancy between features and class labels. While existing feature selection methods give multiple explanations for these relationships, they ignore the multi-value bias of class-independent features and the redundancy of class dependent features. Therefore, a feature selection method (Maximal independent classification information and minimal redundancy, MICIMR) is proposed in this paper. Firstly, the relevance and redundancy terms of class independent characteristics are calculated respectively based on the symmetric uncertainty coefficient. Secondly, it calculates the relevance and redundancy terms of class-dependent features according to the independent classification information criterion. Finally, the selection criteria for these two characteristics are combined. To verify the effectiveness of the MICIMR algorithm, five feature selection methods are compared with the MICIMR algorithm on fifteen real datasets. The experimental results demonstrate that the MICIMR algorithm outperforms the other feature selection algorithms in terms of redundancy rate as well as classification accuracy (Gmean_macro and F1_macro). |
format |
article |
author |
Li Zhang; Xiaobo Chen |
author_sort |
Li Zhang |
title |
Feature Selection Methods Based on Symmetric Uncertainty Coefficients and Independent Classification Information |
publisher |
IEEE |
publishDate |
2021 |
url |
https://doaj.org/article/1897ab04eefd4547a0398a1341baba7e |
work_keys_str_mv |
AT lizhang featureselectionmethodsbasedonsymmetricuncertaintycoefficientsandindependentclassificationinformation AT xiaobochen featureselectionmethodsbasedonsymmetricuncertaintycoefficientsandindependentclassificationinformation |
_version_ |
1718420652908609536 |