Thousands of high-quality sequencing samples fail to show meaningful correlation between 5S and 45S ribosomal DNA arrays in humans

Abstract The ribosomal RNA genes (rDNA) are tandemly arrayed in most eukaryotes and exhibit vast copy number variation. There is growing interest in integrating this variation into genotype–phenotype associations. Here, we explored a possible association of rDNA copy number variation with autism spe...

Bibliographic Details
Main authors: Ashley N. Hall, Tychele N. Turner, Christine Queitsch
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/81d802df5c8344f9810a6cf9b91b6a9c
id oai:doaj.org-article:81d802df5c8344f9810a6cf9b91b6a9c
record_format dspace
spelling oai:doaj.org-article:81d802df5c8344f9810a6cf9b91b6a9c; 2021-12-02T14:12:08Z; DOI 10.1038/s41598-020-80049-y; ISSN 2045-2322; 2021-01-01T00:00:00Z; https://doi.org/10.1038/s41598-020-80049-y; https://doaj.org/toc/2045-2322; Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract The ribosomal RNA genes (rDNA) are tandemly arrayed in most eukaryotes and exhibit vast copy number variation. There is growing interest in integrating this variation into genotype–phenotype associations. Here, we explored a possible association of rDNA copy number variation with autism spectrum disorder and found no difference between probands and unaffected siblings. Because short-read sequencing estimates of rDNA copy number are error prone, we sought to validate our 45S estimates. Previous studies reported tightly correlated, concerted copy number variation between the 45S and 5S arrays, which should enable the validation of 45S copy number estimates with pulsed-field gel-verified 5S copy numbers. Here, we show that the previously reported strong concerted copy number variation may be an artifact of variable data quality in the earlier published 1000 Genomes Project sequences. We failed to detect a meaningful correlation between 45S and 5S copy numbers in thousands of samples from the high-coverage Simons Simplex Collection dataset as well as in the recent high-coverage 1000 Genomes Project sequences. Our findings illustrate the challenge of genotyping repetitive DNA regions accurately and call into question the accuracy of recently published studies of rDNA copy number variation in cancer that relied on diverse publicly available resources for sequence data.
format article
author Ashley N. Hall
Tychele N. Turner
Christine Queitsch
title Thousands of high-quality sequencing samples fail to show meaningful correlation between 5S and 45S ribosomal DNA arrays in humans
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/81d802df5c8344f9810a6cf9b91b6a9c