Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets
Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including prot...
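Main Authors: | Georgios Papoutsoglou, Makrina Karaglani, Vincenzo Lagani, Naomi Thomson, Oluf Dimitri Røe, Ioannis Tsamardinos, Ekaterini Chatzaki |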
---|---|
Format: | article |
Language: | EN |
Published: | Nature Portfolio, 2021 |
Subjects: | Medicine; Science |
Online Access: | https://doaj.org/article/760ad289441a49f7b30b75092e2b03be |
id |
oai:doaj.org-article:760ad289441a49f7b30b75092e2b03be |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:760ad289441a49f7b30b75092e2b03be; 2021-12-02T17:55:03Z; Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets; 10.1038/s41598-021-94501-0; 2045-2322; https://doaj.org/article/760ad289441a49f7b30b75092e2b03be; 2021-07-01T00:00:00Z; https://doi.org/10.1038/s41598-021-94501-0; https://doaj.org/toc/2045-2322; Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed. Analysis of a combined proteomic and metabolomic dataset led to 10 equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in identifying COVID-19 patients from those with a different acute respiratory illness. Another transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in identifying COVID-19 patients from virus-free individuals. Signature predictive performance remained high upon validation. Multiple new features emerged and pathway analysis revealed biological relevance by implication in Viral mRNA Translation, Interferon gamma signaling and Innate Immune System pathways. In conclusion, AutoML analysis led to multiple biosignatures of high predictive performance, with reduced features and large choice of alternative predictors. These favorable characteristics are eminent for development of cost-effective assays to contribute to better disease management.; Georgios Papoutsoglou; Makrina Karaglani; Vincenzo Lagani; Naomi Thomson; Oluf Dimitri Røe; Ioannis Tsamardinos; Ekaterini Chatzaki; Nature Portfolio; article; Medicine; R; Science; Q; EN; Scientific Reports, Vol 11, Iss 1, Pp 1-13 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
description |
Abstract COVID-19 outbreak brings intense pressure on healthcare systems, with an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available high throughput COVID-19 datasets, including proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed. Analysis of a combined proteomic and metabolomic dataset led to 10 equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in identifying COVID-19 patients from those with a different acute respiratory illness. Another transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in identifying COVID-19 patients from virus-free individuals. Signature predictive performance remained high upon validation. Multiple new features emerged and pathway analysis revealed biological relevance by implication in Viral mRNA Translation, Interferon gamma signaling and Innate Immune System pathways. In conclusion, AutoML analysis led to multiple biosignatures of high predictive performance, with reduced features and large choice of alternative predictors. These favorable characteristics are eminent for development of cost-effective assays to contribute to better disease management. |
format |
article |
author |
Georgios Papoutsoglou Makrina Karaglani Vincenzo Lagani Naomi Thomson Oluf Dimitri Røe Ioannis Tsamardinos Ekaterini Chatzaki |
author_sort |
Georgios Papoutsoglou |
title |
Automated machine learning optimizes and accelerates predictive modeling from COVID-19 high throughput datasets |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/760ad289441a49f7b30b75092e2b03be |