Pairwise Correlation Analysis of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer’s disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset...

Bibliographic Details
Main authors: Erik D. Huckvale, Matthew W. Hodgman, Brianna B. Greenwood, Devorah O. Stucki, Katrisa M. Ward, Mark T. W. Ebbert, John S. K. Kauwe, The Alzheimer’s Disease Neuroimaging Initiative, The Alzheimer’s Disease Metabolomics Consortium, Justin B. Miller
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: ADNI; pairwise feature correlation; feature reduction; machine learning; Alzheimer’s disease; Genetics; QH426-470
Online access: https://doaj.org/article/05cfb758642e4e2b8035bed340482946
id oai:doaj.org-article:05cfb758642e4e2b8035bed340482946
record_format dspace
spelling oai:doaj.org-article:05cfb758642e4e2b8035bed340482946
2021-11-25T17:40:23Z
10.3390/genes12111661
2073-4425
https://doaj.org/article/05cfb758642e4e2b8035bed340482946
2021-10-01T00:00:00Z
https://www.mdpi.com/2073-4425/12/11/1661
https://doaj.org/toc/2073-4425
Genes, Vol 12, Iss 1661, p 1661 (2021)
institution DOAJ
collection DOAJ
language EN
topic ADNI
pairwise feature correlation
feature reduction
machine learning
Alzheimer’s disease
Genetics
QH426-470
description The Alzheimer’s Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer’s disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset and progression. While using a variety of biomarkers is essential to AD research, highly correlated input features can significantly decrease machine learning model generalizability and performance. Additionally, redundant features unnecessarily increase the computational time and resources necessary to train predictive models. Therefore, we used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset to determine the extent to which this issue might impact large-scale analyses using these data. We found that 93.457% of biomarkers, 92.549% of the gene expression values, and 100% of MRI features were strongly correlated with at least one other feature in ADNI based on our Bonferroni-corrected α (p-value ≤ 1.40754 × 10^−13). We provide a comprehensive mapping of all ADNI biomarkers to highly correlated features within the dataset. Additionally, we show that significant correlation within the ADNI dataset should be resolved before performing bulk data analyses, and we provide recommendations to address these issues. We anticipate that these recommendations and resources will help guide researchers utilizing the ADNI dataset to increase model performance and reduce the cost and complexity of their analyses.
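As a rough illustration of how a Bonferroni-corrected threshold of this size can arise, the sketch below divides a family-wise α by the number of unordered feature pairs. Two things here are assumptions that this record does not state: a family-wise α of 0.05, and one test per unordered pair among the 49,288 + 793,600 features. The function names and the toy p-values are hypothetical and are not taken from the article. Under those assumptions the computed threshold comes to about 1.40754 × 10^−13, which is consistent with the value quoted in the description.

```python
from math import comb


def bonferroni_alpha(n_features: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold when each unordered feature pair is tested once.

    family_alpha = 0.05 is an assumed default, not a value from the record.
    """
    n_tests = comb(n_features, 2)  # n * (n - 1) / 2 pairwise comparisons
    return family_alpha / n_tests


def features_with_significant_partner(pair_pvalues, alpha):
    """Return the features that appear in at least one pair with p <= alpha."""
    flagged = set()
    for (a, b), p in pair_pvalues.items():
        if p <= alpha:
            flagged.update((a, b))
    return flagged


# Feature counts quoted in the description; summing them is an assumption.
n_features = 49_288 + 793_600
alpha = bonferroni_alpha(n_features)
print(f"pairwise tests: {comb(n_features, 2):,}")
print(f"per-test alpha: {alpha:.5e}")

# Hypothetical p-values for three made-up features, for illustration only.
toy_pvalues = {("f1", "f2"): 1e-20, ("f1", "f3"): 0.3}
print(sorted(features_with_significant_partner(toy_pvalues, alpha)))
```

The second helper mirrors the "correlated with at least one other feature" criterion in the description: a feature is flagged as soon as any pair containing it passes the threshold.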
format article
author Erik D. Huckvale
Matthew W. Hodgman
Brianna B. Greenwood
Devorah O. Stucki
Katrisa M. Ward
Mark T. W. Ebbert
John S. K. Kauwe
The Alzheimer’s Disease Neuroimaging Initiative
The Alzheimer’s Disease Metabolomics Consortium
Justin B. Miller
author_sort Erik D. Huckvale
title Pairwise Correlation Analysis of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation
title_sort pairwise correlation analysis of the alzheimer’s disease neuroimaging initiative (adni) dataset reveals significant feature correlation
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/05cfb758642e4e2b8035bed340482946
_version_ 1718412086044786688