A comparison of time to event analysis methods, using weight status and breast cancer as a case study

Abstract Survival analysis with cohort study data has been traditionally performed using Cox proportional hazards models. Random survival forests (RSFs), a machine learning method, now present an alternative method. Using the UK Women’s Cohort Study (n = 34,493) we evaluate two methods: a Cox model...

Bibliographic Details
Main authors: Georgios Aivaliotis, Jan Palczewski, Rebecca Atkinson, Janet E. Cade, Michelle A. Morris
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/bdff11217d9a40ce95f2803a85af8a03
id oai:doaj.org-article:bdff11217d9a40ce95f2803a85af8a03
record_format dspace
spelling oai:doaj.org-article:bdff11217d9a40ce95f2803a85af8a03
2021-12-02T18:34:13Z
A comparison of time to event analysis methods, using weight status and breast cancer as a case study
10.1038/s41598-021-92944-z
2045-2322
https://doaj.org/article/bdff11217d9a40ce95f2803a85af8a03
2021-07-01T00:00:00Z
https://doi.org/10.1038/s41598-021-92944-z
https://doaj.org/toc/2045-2322
Georgios Aivaliotis
Jan Palczewski
Rebecca Atkinson
Janet E. Cade
Michelle A. Morris
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-9 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Survival analysis with cohort study data has traditionally been performed using Cox proportional hazards models. Random survival forests (RSFs), a machine learning method, now present an alternative. Using the UK Women’s Cohort Study (n = 34,493) we evaluate two methods, a Cox model and an RSF, to investigate the association between Body Mass Index and time to breast cancer incidence. Robustness of the models was assessed by cross-validation and bootstrapping. Histograms of bootstrap coefficients are reported. C-Indices and Integrated Brier Scores are reported for all models. In post-menopausal women, the Cox model Hazard Ratios (HR) for Overweight (OW) and Obese (O) were 1.25 (1.04, 1.51) and 1.28 (0.98, 1.68) respectively, and the RSF Odds Ratios (OR) with partial dependence on menopause for OW and O were 1.34 (1.31, 1.70) and 1.45 (1.42, 1.48). HR are non-significant results. Only the RSF appears confident about the effect of weight status on time to event. Bootstrapping demonstrated that Cox model coefficients can vary significantly, weakening interpretation potential. An RSF was used to produce partial dependence plots (PDPs) showing that OW and O weight status increase the probability of breast cancer incidence in post-menopausal women. All models have a relatively low C-Index and a high Integrated Brier Score. The RSF overfits the data. In our study, RSF can identify complex non-proportional hazard type patterns in the data and allows more complicated relationships to be investigated using PDPs, but it overfits, limiting extrapolation of results to new instances. Moreover, it is less easily interpreted than Cox models. The value of survival analysis remains paramount, and therefore machine learning techniques like RSF should be considered as another method for analysis.
format article
author Georgios Aivaliotis
Jan Palczewski
Rebecca Atkinson
Janet E. Cade
Michelle A. Morris
author_sort Georgios Aivaliotis
title A comparison of time to event analysis methods, using weight status and breast cancer as a case study
title_sort comparison of time to event analysis methods, using weight status and breast cancer as a case study
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/bdff11217d9a40ce95f2803a85af8a03
_version_ 1718377884283830272
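
The abstract in this record reports a C-Index for each model. As a rough illustration of what that metric measures, the sketch below computes Harrell's concordance index in plain Python: among pairs of subjects whose ordering in time is known, it counts how often the subject with the earlier event was also given the higher risk score. The function name `concordance_index` and the toy data are hypothetical and are not taken from the article or from the UK Women's Cohort Study; pairs with tied times are simply skipped, which is a simplifying assumption rather than the authors' procedure.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    times       -- observed follow-up time per subject
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; higher means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times (simplification) and pairs where the shorter
        # time is censored, because their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.3, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 1.0 means every comparable pair is ranked correctly, while 0.5 is what constant or random scores would give, which is why a "relatively low C-Index" signals weak discrimination.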