Feature Selection Methods Based on Symmetric Uncertainty Coefficients and Independent Classification Information

Feature selection is a critical step in the data preprocessing phase in the field of pattern recognition and machine learning. The core of feature selection is to analyze and quantify the relevance, irrelevance, and redundancy between features and class labels. While existing feature selection methods give multiple explanations for these relationships, they ignore the multi-value bias of class-independent features and the redundancy of class-dependent features. Therefore, a feature selection method (maximal independent classification information and minimal redundancy, MICIMR) is proposed in this paper. First, the relevance and redundancy terms of class-independent features are calculated based on the symmetric uncertainty coefficient. Second, the relevance and redundancy terms of class-dependent features are calculated according to the independent classification information criterion. Finally, the selection criteria for these two kinds of features are combined. To verify the effectiveness of the MICIMR algorithm, it is compared with five other feature selection methods on fifteen real datasets. The experimental results demonstrate that MICIMR outperforms the other feature selection algorithms in terms of redundancy rate as well as classification accuracy (Gmean_macro and F1_macro).
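The symmetric uncertainty coefficient the abstract refers to is commonly defined as SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), where I is mutual information and H is Shannon entropy; it normalizes mutual information to [0, 1] and compensates for the bias toward multi-valued features. A minimal illustrative sketch for discrete features (not the authors' MICIMR implementation):

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy H(X) in bits of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def symmetric_uncertainty(xs, ys):
    """SU(X,Y) = 2*I(X;Y) / (H(X) + H(Y)); 1 = fully dependent, 0 = independent."""
    denom = entropy(xs) + entropy(ys)
    return 2 * mutual_information(xs, ys) / denom if denom > 0 else 0.0
```

For example, a feature identical to the class labels yields SU = 1, while a feature statistically independent of the labels yields SU = 0; a relevance/redundancy criterion like the one described would score candidate features against the labels (relevance) and against already-selected features (redundancy) using such values.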


Saved in:
Bibliographic Details
Main Authors: Li Zhang, Xiaobo Chen
Format: Article
Language: English
Published: IEEE, 2021
Online Access: https://doaj.org/article/1897ab04eefd4547a0398a1341baba7e
Record ID: oai:doaj.org-article:1897ab04eefd4547a0398a1341baba7e
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3049815
Published in: IEEE Access, Vol 9, pp 13845-13856 (2021)
Full text: https://ieeexplore.ieee.org/document/9316683/
Journal TOC: https://doaj.org/toc/2169-3536
Subjects: Feature selection; symmetric uncertainty coefficient; independent classification information; relevance; redundancy; Electrical engineering. Electronics. Nuclear engineering (TK1-9971)