An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data
Feature selection is critical in analyzing microarray data, which has many features (genes), or dimensions. However, with only a few samples available, the large search space and the time consumed during selection make choosing relevant and informative genes that improve classification performance a complex...
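Main authors: | Passent El Kafrawy, Hanaa Fathi, Mohammed Qaraad, Ayda K. Kelany, Xumin Chen |
---|---|
Format: | article |
Language: | EN |
Published: | IEEE, 2021 |
Subjects: | Cancer classification; feature selection; genomic microarray data; support vector machine; ensemble minimum redundancy--maximum relevance; Electrical engineering. Electronics. Nuclear engineering; TK1-9971 |
Online access: | https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579 |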
id |
oai:doaj.org-article:fc9ce6eb64744b63be5363e9993b5579 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:fc9ce6eb64744b63be5363e9993b5579; 2021-11-26T00:01:29Z; An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data; 2169-3536; 10.1109/ACCESS.2021.3123090; https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579; 2021-01-01T00:00:00Z; https://ieeexplore.ieee.org/document/9585477/; https://doaj.org/toc/2169-3536; Passent El Kafrawy; Hanaa Fathi; Mohammed Qaraad; Ayda K. Kelany; Xumin Chen; IEEE; article; Cancer classification; feature selection; genomic microarray data; support vector machine; ensemble minimum redundancy--maximum relevance; Electrical engineering. Electronics. Nuclear engineering; TK1-9971; EN; IEEE Access, Vol 9, Pp 155353-155369 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Cancer classification; feature selection; genomic microarray data; support vector machine; ensemble minimum redundancy--maximum relevance; Electrical engineering. Electronics. Nuclear engineering; TK1-9971 |
spellingShingle |
Cancer classification; feature selection; genomic microarray data; support vector machine; ensemble minimum redundancy--maximum relevance; Electrical engineering. Electronics. Nuclear engineering; TK1-9971; Passent El Kafrawy; Hanaa Fathi; Mohammed Qaraad; Ayda K. Kelany; Xumin Chen; An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
description |
Feature selection is critical in analyzing microarray data, which has many features (genes), or dimensions. However, with only a few samples available, the large search space and the time consumed during selection make choosing relevant and informative genes that improve classification performance a complex task. This paper proposes a hybrid gene selection model, SVM-mRMRe, which provides a framework for combining filter-based, ensemble, and embedded methods to select the most relevant and informative genes from high-dimensional microarray data by fusing embedded SVM coefficients (feature ranking) with ensemble mRMRe. Eight of the most commonly used microarray datasets for various types of cancer were used to evaluate the model. The selected feature subset is evaluated with four different classifiers: random forest (RF), multilayer perceptron (MLP), k-nearest neighbors (k-NN), and support vector machine (SVM). The experimental results show that the proposed model reduces time consumption and dimensionality and improves the differentiation of cancerous tissues from benign tissues. Furthermore, the genes selected for the brain cancer dataset are interpreted biologically; they agree with the findings of relevant biomedical studies and play an important role in patient prognosis. |
format |
article |
author |
Passent El Kafrawy; Hanaa Fathi; Mohammed Qaraad; Ayda K. Kelany; Xumin Chen |
author_facet |
Passent El Kafrawy; Hanaa Fathi; Mohammed Qaraad; Ayda K. Kelany; Xumin Chen |
author_sort |
Passent El Kafrawy |
title |
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
title_short |
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
title_full |
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
title_fullStr |
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
title_full_unstemmed |
An Efficient SVM-Based Feature Selection Model for Cancer Classification Using High-Dimensional Microarray Data |
title_sort |
efficient svm-based feature selection model for cancer classification using high-dimensional microarray data |
publisher |
IEEE |
publishDate |
2021 |
url |
https://doaj.org/article/fc9ce6eb64744b63be5363e9993b5579 |
work_keys_str_mv |
AT passentelkafrawy anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT hanaafathi anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT mohammedqaraad anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT aydakkelany anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT xuminchen anefficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT passentelkafrawy efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT hanaafathi efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT mohammedqaraad efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT aydakkelany efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata AT xuminchen efficientsvmbasedfeatureselectionmodelforcancerclassificationusinghighdimensionalmicroarraydata |
_version_ |
1718410006437560320 |
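
The description field above summarizes a two-stage idea: rank genes by the coefficients of a linear SVM, then refine the top-ranked candidates with a minimum redundancy, maximum relevance (mRMR) criterion. The following is a minimal sketch of that general idea and not the authors' implementation. Several things here are assumptions made for illustration: the function names (`linear_svm_weights`, `mrmr_select`, `svm_mrmr`) are hypothetical, the SVM is trained with a simple Pegasos-style subgradient loop without a bias term, absolute Pearson correlation stands in for the mutual-information measures normally used by mRMR, a single greedy mRMR pass replaces the ensemble variant (mRMRe), and the data are synthetic.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def linear_svm_weights(X, y, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM (hinge loss + L2 penalty) with Pegasos-style subgradient steps."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):  # one shuffled pass over the samples
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink (regularization)
            if margin < 1.0:  # sample violates the margin: move toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def mrmr_select(X, y, candidates, k):
    """Greedy mRMR: pick features with high relevance to y and low redundancy with earlier picks."""
    cols = {j: [row[j] for row in X] for j in candidates}
    relevance = {j: abs(pearson(cols[j], y)) for j in candidates}
    selected, remaining = [], list(candidates)

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(abs(pearson(cols[j], cols[s])) for s in selected) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def svm_mrmr(X, y, n_candidates, k):
    """Stage 1: keep the features with the largest |SVM weight|. Stage 2: mRMR on those."""
    w = linear_svm_weights(X, y)
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    candidates = ranked[:n_candidates]
    return candidates, mrmr_select(X, y, candidates, k)


# Synthetic stand-in for expression data: 40 samples, 20 "genes".
# Gene 0 is informative, gene 1 is a near-copy of gene 0 (redundant),
# gene 2 is weakly informative, and the remaining genes are noise.
rng = random.Random(42)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = []
for label in y:
    signal = label + rng.gauss(0, 0.3)
    row = [signal, signal + rng.gauss(0, 0.05), 0.5 * label + rng.gauss(0, 0.5)]
    row += [rng.gauss(0, 1) for _ in range(17)]
    X.append(row)

candidates, selected = svm_mrmr(X, y, n_candidates=8, k=3)
print("SVM candidates:", candidates)
print("mRMR selection:", selected)
```

The redundancy term is what distinguishes the second stage from a plain ranking: a gene that is almost a copy of one already chosen is penalized even when its own relevance is high. On real microarray data, features would typically be standardized first, and the selected subset would then be passed to downstream classifiers such as those named in the description.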