Kernel weighted least square approach for imputing missing values of metabolomics data

Abstract: Mass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional large-scale matrix (samples × metabolites) of quantified data that often contain missing cells in the data matrix as well as outli...

Bibliographic Details
Main authors:	Nishith Kumar, Md. Aminul Hoque, Masahiro Sugimoto
Format:	article
Language:	EN
Published:	Nature Portfolio 2021
Subjects:	Medicine R Science Q
Online access:	https://doaj.org/article/962584cc682c494ab4a1c5087b3765f6

id	oai:doaj.org-article:962584cc682c494ab4a1c5087b3765f6
record_format	dspace
spelling	oai:doaj.org-article:962584cc682c494ab4a1c5087b3765f6 | 2021-12-02T14:49:11Z | Kernel weighted least square approach for imputing missing values of metabolomics data | 10.1038/s41598-021-90654-0 | 2045-2322 | https://doaj.org/article/962584cc682c494ab4a1c5087b3765f6 | 2021-05-01T00:00:00Z | https://doi.org/10.1038/s41598-021-90654-0 | https://doaj.org/toc/2045-2322 | Nishith Kumar | Md. Aminul Hoque | Masahiro Sugimoto | Nature Portfolio | article | Medicine | R | Science | Q | EN | Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
description	Mass spectrometry is a modern, sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional matrix (samples × metabolites) of quantified data that often contains missing cells as well as outliers, which arise for several reasons, including technical and biological sources. Although several missing data imputation techniques are described in the literature, conventional techniques address only the missing value problem and do not handle outliers; as a result, outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that addresses both missing values and outliers. We evaluated the performance of the proposed method against conventional and recently developed imputation techniques using both artificially generated data and experimentally measured data, in the absence and presence of different rates of outliers. Results on both artificial and real metabolomics data indicate that the proposed kernel weight-based technique outperforms the existing alternatives. For user convenience, an R package implementing the proposed technique is available at https://github.com/NishithPaul/tWLSA.
format	article
author	Nishith Kumar, Md. Aminul Hoque, Masahiro Sugimoto
author_facet	Nishith Kumar, Md. Aminul Hoque, Masahiro Sugimoto
author_sort	Nishith Kumar
title	Kernel weighted least square approach for imputing missing values of metabolomics data
publisher	Nature Portfolio
publishDate	2021
url	https://doaj.org/article/962584cc682c494ab4a1c5087b3765f6
