Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
Abstract With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide assoc...
Main Authors: | Jack W. O’Sullivan, John P. A. Ioannidis |
---|---|
Format: | article |
Language: | EN |
Published: |
Nature Portfolio
2021
|
Subjects: | Medicine (R); Science (Q) |
Online Access: | https://doaj.org/article/e525b95011534433b200d6aaa6cf748d |
id |
oai:doaj.org-article:e525b95011534433b200d6aaa6cf748d |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:e525b95011534433b200d6aaa6cf748d
2021-12-02T17:27:19Z
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies
10.1038/s41598-021-97896-y
2045-2322
https://doaj.org/article/e525b95011534433b200d6aaa6cf748d
2021-09-01T00:00:00Z
https://doi.org/10.1038/s41598-021-97896-y
https://doaj.org/toc/2045-2322
Abstract With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier, “discovery” GWAS and a later, “replication” GWAS done in the UK biobank). The analysis evaluated 136,318,924 SNVs (of which 6289 reached P < 5e−8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNV effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNV effect size, phenotype type (binary or quantitative), and discovery P value, we built and validated a model that predicted SNV replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may reflect lack of power rather than genuine false positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS.
Jack W. O’Sullivan
John P. A. Ioannidis
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-7 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
spellingShingle |
Medicine (R); Science (Q); Jack W. O’Sullivan; John P. A. Ioannidis; Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
description |
Abstract With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier, “discovery” GWAS and a later, “replication” GWAS done in the UK biobank). The analysis evaluated 136,318,924 SNVs (of which 6289 reached P < 5e−8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNV effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNV effect size, phenotype type (binary or quantitative), and discovery P value, we built and validated a model that predicted SNV replication with an area under the receiver operating characteristic curve of 0.90. While non-replication may reflect lack of power rather than genuine false positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS. |
format |
article |
author |
Jack W. O’Sullivan; John P. A. Ioannidis |
author_facet |
Jack W. O’Sullivan; John P. A. Ioannidis |
author_sort |
Jack W. O’Sullivan |
title |
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
title_short |
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
title_full |
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
title_fullStr |
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
title_full_unstemmed |
Reproducibility in the UK biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
title_sort |
reproducibility in the uk biobank of genome-wide significant signals discovered in earlier genome-wide association studies |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/e525b95011534433b200d6aaa6cf748d |
work_keys_str_mv |
AT jackwosullivan reproducibilityintheukbiobankofgenomewidesignificantsignalsdiscoveredinearliergenomewideassociationstudies AT johnpaioannidis reproducibilityintheukbiobankofgenomewidesignificantsignalsdiscoveredinearliergenomewideassociationstudies |
_version_ |
1718380821366177792 |
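
The replication rate reported in the abstract above can be illustrated with a minimal sketch. Only the discovery threshold (P < 5e−8) comes from the record; the replication criterion used here (same effect direction plus nominal P < 0.05 in the replication GWAS), the `Snv` fields, and the toy values are assumptions for illustration and are not taken from the article.

```python
from dataclasses import dataclass

GENOME_WIDE_P = 5e-8  # discovery threshold quoted in the abstract


@dataclass
class Snv:
    """One variant with summary statistics from both GWAS (hypothetical layout)."""
    rsid: str
    discovery_p: float
    discovery_beta: float
    replication_p: float
    replication_beta: float


def is_replicated(snv, replication_alpha=0.05):
    """Assumed criterion: same effect direction and nominal significance."""
    same_direction = snv.discovery_beta * snv.replication_beta > 0
    return same_direction and snv.replication_p < replication_alpha


def replication_rate(snvs, replication_alpha=0.05):
    """Share of genome-wide significant discovery SNVs that replicate."""
    significant = [s for s in snvs if s.discovery_p < GENOME_WIDE_P]
    if not significant:
        return float("nan")
    replicated = sum(is_replicated(s, replication_alpha) for s in significant)
    return replicated / len(significant)


# Toy data with made-up identifiers and values, purely for illustration.
toy = [
    Snv("rsA", 1e-9, 0.20, 1e-4, 0.15),    # significant, replicates
    Snv("rsB", 2e-8, 0.10, 0.30, 0.02),    # significant, fails on P value
    Snv("rsC", 1e-10, -0.30, 1e-6, 0.25),  # significant, opposite direction
    Snv("rsD", 1e-3, 0.05, 1e-5, 0.05),    # not genome-wide significant
]
rate = replication_rate(toy)
```

In this toy set, one of the three genome-wide significant variants meets the assumed criterion. A stricter or looser replication threshold would change the rate, which is one reason the definition needs to be stated alongside any reported figure.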