Kernel weighted least square approach for imputing missing values of metabolomics data

Abstract Mass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional large-scale matrix (samples × metabolites) of quantified data that often contain missing cells in the data matrix as well as outli...

Bibliographic Details
Main authors: Nishith Kumar, Md. Aminul Hoque, Masahiro Sugimoto
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/962584cc682c494ab4a1c5087b3765f6
id oai:doaj.org-article:962584cc682c494ab4a1c5087b3765f6
record_format dspace
spelling oai:doaj.org-article:962584cc682c494ab4a1c5087b3765f6
2021-12-02T14:49:11Z
10.1038/s41598-021-90654-0
2045-2322
2021-05-01T00:00:00Z
https://doi.org/10.1038/s41598-021-90654-0
https://doaj.org/toc/2045-2322
Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract: Mass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional, large-scale matrix (samples × metabolites) of quantified data that often contains missing cells as well as outliers, which arise for several reasons, including technical and biological sources. Although several missing data imputation techniques are described in the literature, conventional techniques address only the missing value problem and do not handle outliers, so outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that resolves the problems of both missing values and outliers. We evaluated the performance of the proposed method and of other conventional and recently developed missing value imputation techniques using both artificially generated data and experimentally measured data, in both the absence and presence of different rates of outliers. Results on both artificial data and real metabolomics data indicate the superiority of the proposed kernel weight-based technique over the existing alternatives. For user convenience, an R package implementing the proposed technique was developed and is available (https://github.com/NishithPaul/tWLSA).
format article
author Nishith Kumar
Md. Aminul Hoque
Masahiro Sugimoto
title Kernel weighted least square approach for imputing missing values of metabolomics data
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/962584cc682c494ab4a1c5087b3765f6