An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data

Feature selection is critical in analyzing microarray data, which has many features (genes) or dimensions. However, with only a few samples the large search space and time consumed during their selection make selecting relevant and informative genes that improve classification performance a complex...

Bibliographic Details
Main authors: Passent El Kafrawy, Hanaa Fathi, Mohammed Qaraad, Ayda K. Kelany, Xumin Chen
Format: article
Language: EN
Published: IEEE 2021
Subjects:
Online access: https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579
id oai:doaj.org-article:fc9ce6eb64744b63be5363e9993b5579
record_format dspace
spelling oai:doaj.org-article:fc9ce6eb64744b63be5363e9993b5579
2021-11-26T00:01:29Z
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
2169-3536
10.1109/ACCESS.2021.3123090
https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579
2021-01-01T00:00:00Z
https://ieeexplore.ieee.org/document/9585477/
https://doaj.org/toc/2169-3536
Passent El Kafrawy
Hanaa Fathi
Mohammed Qaraad
Ayda K. Kelany
Xumin Chen
IEEE
article
Cancer classification
feature selection
genomic microarray data
support vector machine
ensemble minimum redundancy–maximum relevance
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
EN
IEEE Access, Vol 9, Pp 155353-155369 (2021)
institution DOAJ
collection DOAJ
language EN
topic Cancer classification
feature selection
genomic microarray data
support vector machine
ensemble minimum redundancy–maximum relevance
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
spellingShingle Cancer classification
feature selection
genomic microarray data
support vector machine
ensemble minimum redundancy–maximum relevance
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
Passent El Kafrawy
Hanaa Fathi
Mohammed Qaraad
Ayda K. Kelany
Xumin Chen
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
description Feature selection is critical in analyzing microarray data, which has many features (genes), or dimensions. However, with only a few samples, the large search space and the time consumed during selection make choosing relevant and informative genes that improve classification performance a complex task. This paper proposes a hybrid gene selection model, SVM-mRMRe, which provides a framework for combining filter-based, ensemble, and embedded methods to select the most relevant and informative genes from high-dimensional microarray data by fusing embedded SVM coefficients (feature ranking) with ensemble mRMRe. Eight of the most commonly used microarray datasets for various types of cancer were used to evaluate the model. The selected feature subset is evaluated with four different classifiers: random forest (RF), multilayer perceptron (MLP), k-nearest neighbors (k-NN), and support vector machine (SVM). The experimental results show that the proposed model reduces time consumption and dimensionality and improves the differentiation of cancer tissues from benign tissues. Furthermore, the selected genes for the brain cancer dataset are biologically interpreted; they agree with the findings of relevant biomedical studies and play an important role in patient prognosis.
format article
author Passent El Kafrawy
Hanaa Fathi
Mohammed Qaraad
Ayda K. Kelany
Xumin Chen
author_facet Passent El Kafrawy
Hanaa Fathi
Mohammed Qaraad
Ayda K. Kelany
Xumin Chen
author_sort Passent El Kafrawy
title An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
title_short An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
title_full An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
title_fullStr An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
title_full_unstemmed An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
title_sort efficient svm-based feature selection model for cancer classification using high-dimensional microarray data
publisher IEEE
publishDate 2021
url https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579
work_keys_str_mv AT passentelkafrawy anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT hanaafathi anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT mohammedqaraad anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT aydakkelany anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT xuminchen anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT passentelkafrawy efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT hanaafathi efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT mohammedqaraad efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT aydakkelany efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
AT xuminchen efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata
_version_ 1718410006437560320
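
The description field above mentions fusing embedded SVM coefficients (feature ranking) with mRMR-style selection. The record does not say how the paper performs that fusion, so the following is only a minimal sketch under stated assumptions: a linear SVM trained by Pegasos-style subgradient descent ranks genes by absolute weight, the top-ranked genes are kept as candidates, and a plain greedy mRMR pass (not the ensemble mRMRe variant) picks the final subset. Absolute Pearson correlation stands in for mutual information, and all function names, parameters, and the toy data are hypothetical.

```python
import random
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    return mean([(p - ma) * (q - mb) for p, q in zip(a, b)]) / (sa * sb)


def linear_svm_weights(X, y, lam=0.01, epochs=200, seed=0):
    """Hinge-loss linear SVM (no bias term) via Pegasos-style SGD; labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1:  # sample violates the margin: take a hinge step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def mrmr_select(X, y, candidates, k):
    """Greedy mRMR (difference form): maximize relevance minus mean redundancy."""
    cols = {j: [row[j] for row in X] for j in candidates}
    relevance = {j: abs(pearson(cols[j], y)) for j in candidates}
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = mean([abs(pearson(cols[j], cols[s])) for s in selected])
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def svm_then_mrmr(X, y, n_prefilter, k):
    """Assumed fusion: SVM |weight| ranking as a prefilter, then mRMR on the survivors."""
    w = linear_svm_weights(X, y)
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return mrmr_select(X, y, ranked[:n_prefilter], k)


def make_toy_data(n=40, d=30, seed=0):
    """Synthetic stand-in for microarray data: features 0-2 carry the class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0, 1) for _ in range(d)]
        row[0] = 2 * label + rng.gauss(0, 0.5)
        row[1] = row[0] + rng.gauss(0, 0.1)  # near-duplicate of feature 0
        row[2] = 2 * label + rng.gauss(0, 0.5)
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = svm_then_mrmr(X, y, n_prefilter=10, k=2)
print("selected feature indices:", selected)
```

In practice the selected subset would then be passed to downstream classifiers such as the RF, MLP, k-NN, and SVM models named in the description. The prefilter size and subset size here are arbitrary illustration values, not settings taken from the paper.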