Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets

Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including prot...

Bibliographic Details
Main authors: Georgios Papoutsoglou, Makrina Karaglani, Vincenzo Lagani, Naomi Thomson, Oluf Dimitri Røe, Ioannis Tsamardinos, Ekaterini Chatzaki
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/760ad289441a49f7b30b75092e2b03be
id oai:doaj.org-article:760ad289441a49f7b30b75092e2b03be
record_format dspace
spelling oai:doaj.org-article:760ad289441a49f7b30b75092e2b03be 2021-12-02T17:55:03Z Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets 10.1038/s41598-021-94501-0 2045-2322 https://doaj.org/article/760ad289441a49f7b30b75092e2b03be 2021-07-01T00:00:00Z https://doi.org/10.1038/s41598-021-94501-0 https://doaj.org/toc/2045-2322 Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed. Analysis of a combined proteomic and metabolomic dataset led to 10 equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in identifying COVID-19 patients from those with a different acute respiratory illness. Another transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in identifying COVID-19 patients from virus-free individuals. Signature predictive performance remained high upon validation. Multiple new features emerged and pathway analysis revealed biological relevance by implication in Viral mRNA Translation, Interferon gamma signaling and Innate Immune System pathways. In conclusion, AutoML analysis led to multiple biosignatures of high predictive performance, with reduced features and large choice of alternative predictors. These favorable characteristics are eminent for development of cost-effective assays to contribute to better disease management. Georgios Papoutsoglou, Makrina Karaglani, Vincenzo Lagani, Naomi Thomson, Oluf Dimitri Røe, Ioannis Tsamardinos, Ekaterini Chatzaki Nature Portfolio article Medicine R Science Q EN Scientific Reports, Vol 11, Iss 1, Pp 1-13 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed. Analysis of a combined proteomic and metabolomic dataset led to 10 equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in identifying COVID-19 patients from those with a different acute respiratory illness. Another transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in identifying COVID-19 patients from virus-free individuals. Signature predictive performance remained high upon validation. Multiple new features emerged and pathway analysis revealed biological relevance by implication in Viral mRNA Translation, Interferon gamma signaling and Innate Immune System pathways. In conclusion, AutoML analysis led to multiple biosignatures of high predictive performance, with reduced features and large choice of alternative predictors. These favorable characteristics are eminent for development of cost-effective assays to contribute to better disease management.
format article
author Georgios Papoutsoglou
Makrina Karaglani
Vincenzo Lagani
Naomi Thomson
Oluf Dimitri Røe
Ioannis Tsamardinos
Ekaterini Chatzaki
author_sort Georgios Papoutsoglou
title Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/760ad289441a49f7b30b75092e2b03be