Elitist Binary Wolf Search Algorithm for Heuristic Feature Selection in High-Dimensional Bioinformatics Datasets

Abstract: Due to the high-dimensional characteristics of the datasets, we propose a new method based on the Wolf Search Algorithm (WSA) for optimising the feature selection problem. The proposed approach uses the natural strategy established by Charles Darwin; that is, ‘It is not the strongest of the spec...

Bibliographic Details
Main authors: Jinyan Li, Simon Fong, Raymond K. Wong, Richard Millham, Kelvin K. L. Wong
Format: article
Language: EN
Published: Nature Portfolio 2017
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/b1be323aba084e8ea579cefa3cc848a8
id oai:doaj.org-article:b1be323aba084e8ea579cefa3cc848a8
record_format dspace
spelling oai:doaj.org-article:b1be323aba084e8ea579cefa3cc848a8
2021-12-02T15:05:09Z
Elitist Binary Wolf Search Algorithm for Heuristic Feature Selection in High-Dimensional Bioinformatics Datasets
10.1038/s41598-017-04037-5
2045-2322
https://doaj.org/article/b1be323aba084e8ea579cefa3cc848a8
2017-06-01T00:00:00Z
https://doi.org/10.1038/s41598-017-04037-5
https://doaj.org/toc/2045-2322
Scientific Reports, Vol 7, Iss 1, Pp 1-14 (2017)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract: Due to the high-dimensional characteristics of the datasets, we propose a new method based on the Wolf Search Algorithm (WSA) for optimising the feature selection problem. The proposed approach uses the natural strategy established by Charles Darwin; that is, ‘It is not the strongest of the species that survives, but the most adaptable’. This means that in the evolution of a swarm, the elitists are motivated to quickly obtain more and better resources. The memory function helps the proposed method avoid repeated searches of the worst position, which makes the search more effective, while the binary strategy reduces the feature selection problem to a similar function optimisation problem. Furthermore, the wrapper strategy combines these strengthened wolves with an extreme learning machine classifier to find a sub-dataset with a reasonable number of features that offers the maximum correctness of global classification models. The experimental results from the six public high-dimensional bioinformatics datasets tested demonstrate that the proposed method can best some of the conventional feature selection methods by up to 29% in classification accuracy, and outperform previous WSAs by up to 99.81% in computational time.
format article
author Jinyan Li
Simon Fong
Raymond K. Wong
Richard Millham
Kelvin K. L. Wong
title Elitist Binary Wolf Search Algorithm for Heuristic Feature Selection in High-Dimensional Bioinformatics Datasets
publisher Nature Portfolio
publishDate 2017
url https://doaj.org/article/b1be323aba084e8ea579cefa3cc848a8
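
The abstract in this record mentions four ingredients: a binary encoding of feature subsets, a wrapper that scores each subset with a classifier, elitism, and a memory of visited positions. The sketch below illustrates how those ideas can fit together in general; it is not the authors' algorithm. It assumes a plain bit-flip neighbourhood instead of the WSA movement rules, and a nearest-centroid classifier as a stand-in for the extreme learning machine. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Wrapper fitness: hold-out accuracy using only the features where mask is True.

    Assumption: a nearest-centroid classifier replaces the extreme learning machine.
    """
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    Xtr, Xte = X_train[:, mask], X_test[:, mask]
    classes = np.unique(y_train)
    centroids = np.stack([Xtr[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test row to every class centroid.
    dists = ((Xte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_test).mean())


def elitist_binary_search(fitness, n_features, n_agents=10, n_iters=30,
                          flip_prob=0.1, seed=0):
    """Generic elitist search over boolean feature masks (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    # Binary strategy: each agent is a boolean vector, one bit per feature.
    agents = rng.random((n_agents, n_features)) < 0.5
    scores = np.array([fitness(a) for a in agents])
    # Memory: masks that were already evaluated are never evaluated again.
    visited = {a.tobytes() for a in agents}
    best_idx = scores.argmax()
    best_mask, best_score = agents[best_idx].copy(), scores[best_idx]
    for _ in range(n_iters):
        for i in range(n_agents):
            # Propose a neighbour by flipping each bit with probability flip_prob.
            candidate = agents[i] ^ (rng.random(n_features) < flip_prob)
            key = candidate.tobytes()
            if key in visited:
                continue
            visited.add(key)
            s = fitness(candidate)
            if s > scores[i]:  # an agent only moves to a strictly better mask
                agents[i], scores[i] = candidate, s
            if s > best_score:  # elitism: the best mask found is never lost
                best_mask, best_score = candidate.copy(), s
    return best_mask, float(best_score)


# Synthetic example: 50 features, of which only the first 3 carry class signal.
data_rng = np.random.default_rng(42)
n, d = 200, 50
y = data_rng.integers(0, 2, size=n)
X = data_rng.normal(size=(n, d))
X[:, :3] += 2.0 * y[:, None]
X_train, X_test, y_train, y_test = X[:100], X[100:], y[:100], y[100:]


def fitness(mask):
    return nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask)


mask, score = elitist_binary_search(fitness, d)
print("selected features:", int(mask.sum()), "hold-out accuracy:", score)
```

Scoring on a single hold-out split keeps the sketch short; a wrapper used in practice would more likely use cross-validation, and the fitness could also penalise large subsets to favour a smaller number of features.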