Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data.

We investigate the feasibility of molecular-level sample classification of sepsis using microarray gene expression data merged by in silico meta-analysis. Publicly available data series were extracted from NCBI Gene Expression Omnibus and EMBL-EBI ArrayExpress to create a comprehensive meta-analysis...

Bibliographic Details
Main authors: Dominik Schaack, Markus A Weigand, Florian Uhle
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/229cd1e057ca4c62ac611b4e7b29f9ea
id oai:doaj.org-article:229cd1e057ca4c62ac611b4e7b29f9ea
record_format dspace
spelling oai:doaj.org-article:229cd1e057ca4c62ac611b4e7b29f9ea
2021-11-25T06:19:10Z
1932-6203
10.1371/journal.pone.0251800
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0251800
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 16, Iss 5, p e0251800 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description We investigate the feasibility of molecular-level sample classification of sepsis using microarray gene expression data merged by in silico meta-analysis. Publicly available data series were extracted from NCBI Gene Expression Omnibus and EMBL-EBI ArrayExpress to create a comprehensive meta-analysis microarray expression set (meta-expression set). Measurements had to be obtained via microarray-technique from whole blood samples of adult or pediatric patients with sepsis diagnosed based on international consensus definition immediately after admission to the intensive care unit. We aggregate trauma patients, systemic inflammatory response syndrome (SIRS) patients, and healthy controls in a non-septic entity. Differential expression (DE) analysis is compared with machine-learning-based solutions like decision tree (DT), random forest (RF), support vector machine (SVM), and deep-learning neural networks (DNNs). We evaluated classifier training and discrimination performance in 100 independent iterations. To test diagnostic resilience, we gradually degraded expression data in multiple levels. Clustering of expression values based on DE genes results in partial identification of sepsis samples. In contrast, RF, SVM, and DNN provide excellent diagnostic performance measured in terms of accuracy and area under the curve (>0.96 and >0.99, respectively). We prove DNNs as the most resilient methodology, virtually unaffected by targeted removal of DE genes. By surpassing most other published solutions, the presented approach substantially augments current diagnostic capability in intensive care medicine.
format article
author Dominik Schaack
Markus A Weigand
Florian Uhle
title Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/229cd1e057ca4c62ac611b4e7b29f9ea
_version_ 1718413911289495552
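
The abstract above reports classifier performance as accuracy and area under the curve, evaluated over 100 independent iterations. The following is only a minimal sketch of what such a repeated evaluation loop can look like, not the authors' pipeline. It assumes synthetic Gaussian "expression" values in place of the meta-expression set and swaps in a simple nearest-centroid classifier for the DT, RF, SVM, and DNN models named in the record. Every function name and parameter here (`make_samples`, `evaluate_once`, `shift`, `test_fraction`) is hypothetical.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_samples(rng, n_per_class=40, n_genes=20, shift=1.0):
    """Synthetic data (assumption): class 1 is shifted in the first 5 genes."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(shift * label if g < 5 else 0.0, 1.0)
                 for g in range(n_genes)]
            data.append((x, label))
    return data


def centroid(rows):
    """Per-gene mean of a list of expression vectors."""
    return [mean(col) for col in zip(*rows)]


def evaluate_once(rng, data, test_fraction=0.3):
    """One iteration: stratified random split, fit, return (accuracy, AUC)."""
    train, test = [], []
    for label in (0, 1):
        rows = [d for d in data if d[1] == label]
        rng.shuffle(rows)
        n_test = int(len(rows) * test_fraction)
        test += rows[:n_test]
        train += rows[n_test:]
    c0 = centroid([x for x, y in train if y == 0])
    c1 = centroid([x for x, y in train if y == 1])

    def score(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 - d1  # positive when x is closer to the class-1 centroid

    scores = [score(x) for x, _ in test]
    labels = [y for _, y in test]
    acc = mean([int((s > 0) == (y == 1)) for s, y in zip(scores, labels)])
    return acc, auc(scores, labels)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_samples(rng)
results = [evaluate_once(rng, data) for _ in range(100)]
mean_acc = mean([a for a, _ in results])
mean_auc = mean([u for _, u in results])
print(f"mean accuracy={mean_acc:.3f}, mean AUC={mean_auc:.3f}")
```

Each iteration in this sketch draws a fresh split from one fixed synthetic data set. How the study itself defines an independent iteration is not stated in this record.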