ESTIMATION OF MISSING VALUES USING OPTIMISED HYBRID FUZZY C-MEANS AND MAJORITY VOTE FOR MICROARRAY DATA

Missing values are a major constraint in microarray technologies and hinder the identification of disease-causing genes. Field experts inevitably have to estimate missing values. Imputation is an effective way to supply suitable values before proceeding with the next process...

Bibliographic Details
Main authors: Shamini Raja Kumaran, Mohd Shahizan Othman, Lizawati Mi Yusuf
Format: article
Language: EN
Published: UUM Press 2020
Subjects:
Online access: https://doaj.org/article/f49774a11c4548b4b36183a8c53a4532
id oai:doaj.org-article:f49774a11c4548b4b36183a8c53a4532
record_format dspace
spelling oai:doaj.org-article:f49774a11c4548b4b36183a8c53a4532
2021-11-15T07:17:02Z
10.32890/jict2020.19.4.1
1675-414X
2180-3862
https://doaj.org/article/f49774a11c4548b4b36183a8c53a4532
2020-08-01T00:00:00Z
http://e-journal.uum.edu.my/index.php/jict/article/view/jict2020.19.4.1
https://doaj.org/toc/1675-414X
https://doaj.org/toc/2180-3862
Journal of ICT, Vol 19, Iss 4, Pp 459-482 (2020)
institution DOAJ
collection DOAJ
language EN
topic fuzzy c-means
majority vote
missing values
microarray data
data optimisation
Information technology
T58.5-58.64
description Missing values are a major constraint in microarray technologies and hinder the identification of disease-causing genes. Field experts inevitably have to estimate missing values. Imputation is an effective way to supply suitable values so that the next stage of microarray analysis can proceed. Missing value imputation methods may increase the classification accuracy. Although these methods can predict the values, it is the classification accuracy rate that demonstrates how well a method estimates the missing values in gene expression data. In this study, a novel method, Optimised Hybrid of Fuzzy C-Means and Majority Vote (opt-FCMMV), was proposed to estimate the missing values in the data. Using the Majority Vote (MV) and optimisation through Particle Swarm Optimisation (PSO), this study predicted missing values in the data to produce more informative and reliable data. To verify the effectiveness of opt-FCMMV, several experiments were carried out on two publicly available biomedical microarray datasets (Ovary Cancer and Lung Cancer) under three missing value mechanisms and five missing value percentages, using a Support Vector Machine (SVM) classifier. The experimental results showed that the proposed method achieved the highest accuracy rate compared with no imputation, imputation by Fuzzy C-Means (FCM), and imputation by Fuzzy C-Means with Majority Vote (FCMMV). For example, the accuracy rates for Ovary Cancer data with 5% missing values were 64.0% with no imputation, 81.8% with FCM, 90.0% with FCMMV, and 93.7% with opt-FCMMV. This outcome suggests that opt-FCMMV may also be applied in other domains to prepare datasets for various data mining tasks.
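The abstract names Fuzzy C-Means (FCM) imputation as one of the baselines but gives no algorithmic detail. As a rough illustration only, the sketch below shows one common, generic way to impute with FCM: start from column means, cluster, then replace each missing cell with a membership-weighted average of the cluster centres. It is not the authors' opt-FCMMV (no Majority Vote, no PSO), and the function name `fcm_impute`, its parameters, and the toy data are all assumptions made for this example. It also assumes no column is entirely missing.

```python
import numpy as np


def fcm_impute(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Fill NaNs in X (samples x features) using a plain fuzzy c-means loop.

    Hypothetical helper for illustration; not taken from the article.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means so that FCM can run on a complete matrix.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        W = U ** m  # fuzzified memberships
        # Cluster centres are membership-weighted means of the samples.
        centres = (W.T @ filled) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every centre.
        d2 = ((filled[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Standard FCM membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Re-estimate only the missing cells from the current fuzzy partition.
        W = U ** m
        estimate = (W @ centres) / W.sum(axis=1, keepdims=True)
        filled = np.where(missing, estimate, X)
    return filled


# Toy data (made up): two well-separated groups, one missing cell.
X_demo = np.array([
    [1.0, 1.0],
    [1.2, 0.9],
    [0.9, np.nan],
    [5.0, 5.0],
    [5.1, 4.9],
    [4.8, 5.2],
])
filled_demo = fcm_impute(X_demo)
```

In a pipeline like the one the abstract describes, the completed matrix would then be passed to a classifier such as an SVM, and the resulting accuracy compared across imputation methods.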
format article
author Shamini Raja Kumaran
Mohd Shahizan Othman
Lizawati Mi Yusuf
title ESTIMATION OF MISSING VALUES USING OPTIMISED HYBRID FUZZY C-MEANS AND MAJORITY VOTE FOR MICROARRAY DATA
publisher UUM Press
publishDate 2020
url https://doaj.org/article/f49774a11c4548b4b36183a8c53a4532