Systematic inference of copy-number genotypes from personal genome sequencing data reveals extensive olfactory receptor gene content diversity.

Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel compu...

Bibliographic Details
Main authors: Sebastian M Waszak, Yehudit Hasin, Thomas Zichner, Tsviya Olender, Ifat Keydar, Miriam Khen, Adrian M Stütz, Andreas Schlattl, Doron Lancet, Jan O Korbel
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Biology (General)
Online access: https://doaj.org/article/55e67763b13046bab013d5e4fd461399
id oai:doaj.org-article:55e67763b13046bab013d5e4fd461399
record_format dspace
spelling oai:doaj.org-article:55e67763b13046bab013d5e4fd461399
2021-11-18T05:51:56Z
1553-734X
1553-7358
10.1371/journal.pcbi.1000988
https://doaj.org/article/55e67763b13046bab013d5e4fd461399
2010-11-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21085617/pdf/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
PLoS Computational Biology, Vol 6, Iss 11, p e1000988 (2010)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
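The description above mentions inferring integer copy-number genotypes from depth-of-coverage. As a rough illustration only, the sketch below shows the simplest form of that idea: scale the read depth at a locus by the depth expected for two copies and round to the nearest integer. This is not CopySeq's statistical framework, which the record does not detail; the function names, the median-based summary, and the `max_copies` cap are all hypothetical choices made for this example.

```python
from statistics import median


def estimate_copy_number(locus_depth, diploid_depth, max_copies=10):
    """Naive integer copy-number estimate from depth of coverage.

    Assumption (not from the record): read depth scales linearly with
    copy number, and diploid_depth is the depth seen at two copies.
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    # Two copies correspond to a depth ratio of 1.0, one copy to 0.5, etc.
    copies = round(2 * locus_depth / diploid_depth)
    # Clamp to a plausible range; max_copies is an arbitrary illustrative cap.
    return max(0, min(max_copies, copies))


def genotype_locus(window_depths, reference_depths):
    """Summarize per-window depths by their median, then estimate copies."""
    return estimate_copy_number(median(window_depths), median(reference_depths))


if __name__ == "__main__":
    # Made-up depths for illustration: baseline regions assumed to be diploid.
    baseline = [30, 29, 31, 30]
    for label, depths in [
        ("homozygous deletion", [0, 1, 0]),
        ("heterozygous deletion", [15, 14, 16]),
        ("normal", [30, 31, 29]),
        ("duplication", [44, 46, 45]),
    ]:
        print(label, genotype_locus(depths, baseline))
```

With such a scheme, a heterozygous deletion (one copy) and a homozygous deletion (zero copies) fall into different integer classes, which is the distinction the description highlights. At low coverage, real data are noisy enough that simple rounding is unreliable, which is presumably why a statistical framework is needed.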
format article
author Sebastian M Waszak
Yehudit Hasin
Thomas Zichner
Tsviya Olender
Ifat Keydar
Miriam Khen
Adrian M Stütz
Andreas Schlattl
Doron Lancet
Jan O Korbel
author_sort Sebastian M Waszak
title Systematic inference of copy-number genotypes from personal genome sequencing data reveals extensive olfactory receptor gene content diversity.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/55e67763b13046bab013d5e4fd461399
_version_ 1718424725596667904