Impact of pre- and post-variant filtration strategies on imputation

Abstract Quality control (QC) methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the n...

Bibliographic Details
Main authors: Céline Charon, Rodrigue Allodji, Vincent Meyer, Jean-François Deleuze
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/2730f7d8a71343b285772a9c544f1ed0
id oai:doaj.org-article:2730f7d8a71343b285772a9c544f1ed0
record_format dspace
spelling oai:doaj.org-article:2730f7d8a71343b285772a9c544f1ed0
2021-12-02T17:04:59Z
Impact of pre- and post-variant filtration strategies on imputation
10.1038/s41598-021-85333-z
2045-2322
https://doaj.org/article/2730f7d8a71343b285772a9c544f1ed0
2021-03-01T00:00:00Z
https://doi.org/10.1038/s41598-021-85333-z
https://doaj.org/toc/2045-2322
Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Quality control (QC) methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1089 NCBI recorded individuals for additional validation. Without QC-based variant pre-filtration, we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E−04–1E−03) and rare variants (1E−03–5E−03) (p < 1E−04). Increasing the post-filtration imputation quality score threshold from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) with frequency < 0.001 by 2.5-fold, with or without QC pre-filtration, and halved the number of very rare variants (5E−04). Thus, to maintain confidence and retain enough SNVs, we propose here a two-step filtering procedure that allows less stringent filtering both prior to imputation and post-imputation, in order to increase the number of very rare and rare variants compared to conservative filtration methods.
format article
author Céline Charon
Rodrigue Allodji
Vincent Meyer
Jean-François Deleuze
title Impact of pre- and post-variant filtration strategies on imputation
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/2730f7d8a71343b285772a9c544f1ed0
_version_ 1718381779633569792
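
The post-imputation step described in the abstract (comparing quality score thresholds of 0.3 and 0.8 and counting variants in the very rare and rare frequency ranges) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` class, its field names, the half-open bin boundaries and the example values are all assumptions made for illustration, and the abstract does not say which imputation quality metric was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one imputed variant (field names are assumptions)."""
    variant_id: str
    maf: float      # allele frequency of the imputed variant
    quality: float  # imputation quality score (metric not named in the abstract)


# Frequency ranges quoted in the abstract; treating them as [lower, upper)
# is an assumption, since the abstract does not state boundary handling.
FREQ_BINS = {
    "very_rare": (5e-4, 1e-3),
    "rare": (1e-3, 5e-3),
}


def post_filter(variants, min_quality):
    """Keep only variants whose quality score reaches the threshold."""
    return [v for v in variants if v.quality >= min_quality]


def count_by_bin(variants):
    """Count variants falling into each frequency bin of FREQ_BINS."""
    counts = {name: 0 for name in FREQ_BINS}
    for v in variants:
        for name, (lower, upper) in FREQ_BINS.items():
            if lower <= v.maf < upper:
                counts[name] += 1
                break
    return counts


# Made-up example data, not taken from the study.
variants = [
    Variant("v1", 6e-4, 0.35),
    Variant("v2", 8e-4, 0.85),
    Variant("v3", 2e-3, 0.50),
    Variant("v4", 4e-3, 0.90),
    Variant("v5", 1e-2, 0.95),
]

lenient = count_by_bin(post_filter(variants, 0.3))
strict = count_by_bin(post_filter(variants, 0.8))
print("threshold 0.3:", lenient)
print("threshold 0.8:", strict)
```

With this toy input, raising the threshold from 0.3 to 0.8 removes one variant from each bin, which mirrors in miniature the trade-off the abstract reports between confidence and the number of retained rare variants.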