Pairwise Correlation Analysis of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation
The Alzheimer’s Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer’s disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset...
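Main authors: | Erik D. Huckvale, Matthew W. Hodgman, Brianna B. Greenwood, Devorah O. Stucki, Katrisa M. Ward, Mark T. W. Ebbert, John S. K. Kauwe, The Alzheimer’s Disease Neuroimaging Initiative, The Alzheimer’s Disease Metabolomics Consortium, Justin B. Miller |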
---|---|
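Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | ADNI; pairwise feature correlation; feature reduction; machine learning; Alzheimer’s disease; Genetics; QH426-470 |
Online access: | https://doaj.org/article/05cfb758642e4e2b8035bed340482946 |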
id |
oai:doaj.org-article:05cfb758642e4e2b8035bed340482946 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:05cfb758642e4e2b8035bed340482946; 2021-11-25T17:40:23Z; DOI 10.3390/genes12111661; ISSN 2073-4425; https://doaj.org/article/05cfb758642e4e2b8035bed340482946; 2021-10-01T00:00:00Z; https://www.mdpi.com/2073-4425/12/11/1661; https://doaj.org/toc/2073-4425; MDPI AG; article; Genetics; QH426-470; EN; Genes, Vol 12, Iss 1661, p 1661 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
ADNI; pairwise feature correlation; feature reduction; machine learning; Alzheimer’s disease; Genetics; QH426-470 |
description |
The Alzheimer’s Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer’s disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset and progression. While using a variety of biomarkers is essential to AD research, highly correlated input features can significantly decrease machine learning model generalizability and performance. Additionally, redundant features unnecessarily increase computational time and resources necessary to train predictive models. Therefore, we used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset to determine the extent to which this issue might impact large-scale analyses using these data. We found that 93.457% of biomarkers, 92.549% of the gene expression values, and 100% of MRI features were strongly correlated with at least one other feature in ADNI based on our Bonferroni-corrected α (p-value ≤ 1.40754 × 10^−13). We provide a comprehensive mapping of all ADNI biomarkers to highly correlated features within the dataset. Additionally, we show that significant correlation within the ADNI dataset should be resolved before performing bulk data analyses, and we provide recommendations to address these issues. We anticipate that these recommendations and resources will help guide researchers utilizing the ADNI dataset to increase model performance and reduce the cost and complexity of their analyses. |
format |
article |
author |
Erik D. Huckvale; Matthew W. Hodgman; Brianna B. Greenwood; Devorah O. Stucki; Katrisa M. Ward; Mark T. W. Ebbert; John S. K. Kauwe; The Alzheimer’s Disease Neuroimaging Initiative; The Alzheimer’s Disease Metabolomics Consortium; Justin B. Miller |
author_sort |
Erik D. Huckvale |
title |
Pairwise Correlation Analysis of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/05cfb758642e4e2b8035bed340482946 |
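
The abstract reports a Bonferroni-corrected α of 1.40754 × 10^−13 but this record does not say how it was derived. The sketch below checks one plausible reading: a family-wise α of 0.05 divided by the number of unordered pairs among all 49,288 + 793,600 features. Both the 0.05 level and the "every pair tested once" count are assumptions made for illustration, and `bonferroni_alpha` is a hypothetical helper name, not something taken from the article.

```python
from math import comb


def bonferroni_alpha(n_features: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold if every unordered feature pair is tested once.

    family_alpha = 0.05 is an assumed family-wise level, not stated in the record.
    """
    n_tests = comb(n_features, 2)  # n * (n - 1) / 2 pairwise comparisons
    return family_alpha / n_tests


# Feature counts quoted in the abstract.
N_BIOMARKERS = 49_288
N_MRI_FEATURES = 793_600
N_FEATURES = N_BIOMARKERS + N_MRI_FEATURES

N_PAIRS = comb(N_FEATURES, 2)
ALPHA = bonferroni_alpha(N_FEATURES)

print(f"{N_PAIRS:,} pairwise tests")
print(f"alpha = {ALPHA:.5e}")  # alpha = 1.40754e-13
```

Under these assumptions the computed threshold agrees with the reported value to six significant figures, which suggests, but does not confirm, that the correction was applied across all feature pairs.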