Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma

Abstract Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling...

Bibliographic Details
Main authors: Nicholas Nuechterlein, Linda G. Shapiro, Eric C. Holland, Patrick J. Cimino
Format: article
Language: EN
Published: BMC 2021
Subjects:
IDH
Online access: https://doaj.org/article/7ae1cad55a3b4fa3835686acc7e09a84
id oai:doaj.org-article:7ae1cad55a3b4fa3835686acc7e09a84
record_format dspace
spelling oai:doaj.org-article:7ae1cad55a3b4fa3835686acc7e09a84
2021-12-05T12:07:41Z
Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma
10.1186/s40478-021-01295-3
2051-5960
https://doaj.org/article/7ae1cad55a3b4fa3835686acc7e09a84
2021-12-01T00:00:00Z
https://doi.org/10.1186/s40478-021-01295-3
https://doaj.org/toc/2051-5960
Acta Neuropathologica Communications, Vol 9, Iss 1, Pp 1-18 (2021)
institution DOAJ
collection DOAJ
language EN
topic Adult diffuse glioma
Glioblastoma
Astrocytoma
Oligodendroglioma
Copy number
IDH
Neurology. Diseases of the nervous system
RC346-429
description Abstract Knowledge of 1p/19q-codeletion and IDH1/2 mutational status is necessary to interpret any investigational study of diffuse gliomas in the modern era. While DNA sequencing is the gold standard for determining IDH mutational status, genome-wide methylation arrays and gene expression profiling have been used for surrogate mutational determination. Previous studies by our group suggest that 1p/19q-codeletion and IDH mutational status can be predicted by genome-wide somatic copy number alteration (SCNA) data alone; however, a rigorous model to accomplish this task has yet to be established. In this study, we used SCNA data from 786 adult diffuse gliomas in The Cancer Genome Atlas (TCGA) to develop a two-stage classification system that identifies 1p/19q-codeleted oligodendrogliomas and predicts the IDH mutational status of astrocytic tumors using a machine-learning model. Cross-validation on TCGA SCNA data showed near-perfect classification. Furthermore, our astrocytic IDH mutation model validated well on four additional datasets (AUC = 0.97, AUC = 0.99, AUC = 0.95, AUC = 0.96), as did our 1p/19q-codeleted oligodendroglioma screen on the two datasets that contained oligodendrogliomas (MCC = 0.97, MCC = 0.97). We then retrained our system using data from these validation sets and applied it to a cohort of REMBRANDT study subjects for whom SCNA data, but not IDH mutational status, is available. Overall, using genome-wide SCNAs, we successfully developed a system to robustly predict 1p/19q-codeletion and IDH mutational status in diffuse gliomas. This system can assign molecular subtype labels to tumor samples of retrospective diffuse glioma cohorts that lack 1p/19q-codeletion and IDH mutational status, such as the REMBRANDT study, recasting these datasets as validation cohorts for diffuse glioma research.
format article
author Nicholas Nuechterlein
Linda G. Shapiro
Eric C. Holland
Patrick J. Cimino
title Machine learning modeling of genome-wide copy number alteration signatures reliably predicts IDH mutational status in adult diffuse glioma
publisher BMC
publishDate 2021
url https://doaj.org/article/7ae1cad55a3b4fa3835686acc7e09a84
_version_ 1718372201016590336
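
Note on the metrics quoted in the abstract: the record reports validation results as AUC (area under the ROC curve) and MCC (Matthews correlation coefficient). The sketch below shows how these two quantities are conventionally computed for a binary classifier. It is only an illustration under stated assumptions: the function names and the toy labels and scores are hypothetical, are not taken from the article, and say nothing about how the authors implemented their evaluation.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels coded as 0/1 (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = mutant, 0 = wildtype).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_calls = [1 if s >= 0.6 else 0 for s in toy_scores]  # arbitrary cutoff

toy_mcc = matthews_corrcoef(toy_labels, toy_calls)
toy_auc = roc_auc(toy_labels, toy_scores)
```

AUC is computed from continuous scores and does not depend on a decision threshold, whereas MCC is computed from hard class calls and stays informative when the two classes are imbalanced.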