Comparison of machine learning clustering algorithms for detecting heterogeneity of treatment effect in acute respiratory distress syndrome: A secondary analysis of three randomised controlled trials

Background: Heterogeneity in Acute Respiratory Distress Syndrome (ARDS), as a consequence of its non-specific definition, has led to a multitude of negative randomised controlled trials (RCTs). Investigators have sought to identify heterogeneity of treatment effect (HTE) in RCTs using clustering alg...

Bibliographic Details
Main authors:	Pratik Sinha, Alexandra Spicer, Kevin L Delucchi, Daniel F McAuley, Carolyn S Calfee, Matthew M Churpek
Format:	article
Language:	EN
Published:	Elsevier 2021
Subjects:	ARDS; RCTs; Clustering; machine learning; LCA; Heterogeneity of treatment effect; Medicine; R; Medicine (General); R5-920
Online access:	https://doaj.org/article/dcd0c2641c8d430e8bbc69ed3dea44b9

id	oai:doaj.org-article:dcd0c2641c8d430e8bbc69ed3dea44b9
record_format	dspace
spelling	oai:doaj.org-article:dcd0c2641c8d430e8bbc69ed3dea44b9; 2021-12-02T05:01:54Z; ISSN 2352-3964; DOI 10.1016/j.ebiom.2021.103697; https://doaj.org/article/dcd0c2641c8d430e8bbc69ed3dea44b9; 2021-12-01T00:00:00Z; http://www.sciencedirect.com/science/article/pii/S2352396421004916; https://doaj.org/toc/2352-3964; EBioMedicine, Vol 74, Iss , Pp 103697- (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	ARDS RCTs Clustering machine learning LCA Heterogeneity of treatment effect Medicine R Medicine (General) R5-920
description	Background: Heterogeneity in Acute Respiratory Distress Syndrome (ARDS), as a consequence of its non-specific definition, has led to a multitude of negative randomised controlled trials (RCTs). Investigators have sought to identify heterogeneity of treatment effect (HTE) in RCTs using clustering algorithms. We evaluated the proficiency of several commonly used machine-learning algorithms to identify clusters where HTE may be detected. Methods: Five unsupervised algorithms (latent class analysis (LCA), K-means, partition around medoids, hierarchical, and spectral clustering) and four supervised algorithms (model-based recursive partitioning, Causal Forest (CF), X-learner with Random Forest (XL-RF), and X-learner with Bayesian Additive Regression Trees) were individually applied to three prior ARDS RCTs. Clinical data and research protein biomarkers were used as partitioning variables, with the latter excluded for secondary analyses. For each clustering schema, HTE was evaluated based on the interaction term of treatment group and cluster, with day-90 mortality as the dependent variable. Findings: No single algorithm identified clusters with significant HTE in all three trials. LCA, XL-RF, and CF identified HTE most frequently (2/3 RCTs). Important partitioning variables in the unsupervised approaches were consistent across algorithms and RCTs. In supervised models, important partitioning variables varied between algorithms and across RCTs. Where several algorithms produced clusters demonstrating HTE in the same trial, patients frequently switched between treatment-benefit and treatment-harm clusters from one algorithm to another. LCA aside, results from all other algorithms were subject to significant alteration in cluster composition and HTE with random seed change. Removing research biomarkers as partitioning variables greatly reduced the chances of detecting HTE across all algorithms. 
Interpretation: Machine-learning algorithms were inconsistent in their abilities to identify clusters with significant HTE. Protein biomarkers were essential in identifying clusters with HTE. Investigations using machine-learning approaches to identify clusters to seek HTE require cautious interpretation. Funding: NIGMS R35 GM142992 (PS), NHLBI R35 HL140026 (CSC); NIGMS R01 GM123193, Department of Defense W81XWH-21-1-0009, NIA R21 AG068720, NIDA R01 DA051464 (MMC)
format	article
author	Pratik Sinha Alexandra Spicer Kevin L Delucchi Daniel F McAuley Carolyn S Calfee Matthew M Churpek
title	Comparison of machine learning clustering algorithms for detecting heterogeneity of treatment effect in acute respiratory distress syndrome: A secondary analysis of three randomised controlled trials
publisher	Elsevier
publishDate	2021
url	https://doaj.org/article/dcd0c2641c8d430e8bbc69ed3dea44b9
_version_	1718400799835422720
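
The description above states that HTE was evaluated from the interaction term of treatment group and cluster, with day-90 mortality as the dependent variable. As a rough illustration only, the sketch below computes that interaction term for the simplest possible case: two clusters, two trial arms, and an unadjusted logistic model. In that saturated case the interaction coefficient equals the difference between the within-cluster log odds ratios, so no model fitting is needed. The function name `interaction_test`, the Wald test, and all patient counts are assumptions made for this example; they are not taken from the study, which may have used a different test or an adjusted model.

```python
import math


def interaction_test(counts):
    """Wald test for a treatment-by-cluster interaction on mortality.

    counts[cluster][arm] = (deaths, survivors), with arm 0 = control and
    arm 1 = treatment. Exactly two clusters are assumed (hypothetical layout).
    """
    log_odds_ratios = []
    variance = 0.0
    for cluster in (0, 1):
        (d0, s0), (d1, s1) = counts[cluster][0], counts[cluster][1]
        # Log odds ratio of death, treatment versus control, in this cluster.
        log_odds_ratios.append(math.log((d1 * s0) / (s1 * d0)))
        # Large-sample variance of a log odds ratio: sum of reciprocal cells.
        variance += 1 / d0 + 1 / s0 + 1 / d1 + 1 / s1
    # In a saturated logistic model the interaction coefficient is the
    # difference between the two within-cluster log odds ratios.
    beta = log_odds_ratios[1] - log_odds_ratios[0]
    se = math.sqrt(variance)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value


# Invented day-90 counts: treatment looks beneficial in cluster 0 and
# harmful in cluster 1.
example_counts = {
    0: {0: (30, 70), 1: (20, 80)},
    1: {0: (25, 75), 1: (40, 60)},
}
beta, se, z, p_value = interaction_test(example_counts)
print(f"interaction = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value here would indicate that the treatment effect on mortality differs between the two clusters. With more than two clusters or with covariate adjustment, a fitted regression model and a likelihood-ratio test would be the more usual route.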
