Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies

Abstract: With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide assoc...

Bibliographic Details
Main Authors: Jack W. O’Sullivan, John P. A. Ioannidis
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/e525b95011534433b200d6aaa6cf748d
id oai:doaj.org-article:e525b95011534433b200d6aaa6cf748d
record_format dspace
spelling oai:doaj.org-article:e525b95011534433b200d6aaa6cf748d
2021-12-02T17:27:19Z
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
10.1038/s41598-021-97896-y
2045-2322
https://doaj.org/article/e525b95011534433b200d6aaa6cf748d
2021-09-01T00:00:00Z
https://doi.org/10.1038/s41598-021-97896-y
https://doaj.org/toc/2045-2322
Jack W. O’Sullivan
John P. A. Ioannidis
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-7 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Jack W. O’Sullivan
John P. A. Ioannidis
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
description Abstract: With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier “discovery” GWAS and a later “replication” GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNVs (of which 6289 reached P < 5e−8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNV effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNV effect size, phenotype trait (binary or quantitative), and discovery P value, we built and validated a model that predicted SNV replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may reflect lack of power rather than genuine false positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS.
format article
author Jack W. O’Sullivan
John P. A. Ioannidis
author_facet Jack W. O’Sullivan
John P. A. Ioannidis
author_sort Jack W. O’Sullivan
title Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
title_short Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
title_full Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
title_fullStr Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
title_full_unstemmed Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
title_sort reproducibility in the uk biobank of genome-wide significant signals discovered in earlier genome-wide association studies
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/e525b95011534433b200d6aaa6cf748d
work_keys_str_mv AT jackwosullivan reproducibilityintheukbiobankofgenomewidesignificantsignalsdiscoveredinearliergenomewideassociationstudies
AT johnpaioannidis reproducibilityintheukbiobankofgenomewidesignificantsignalsdiscoveredinearliergenomewideassociationstudies
_version_ 1718380821366177792