Most "dark matter" transcripts are associated with known genes.

A series of reports over the last few years has indicated that a much larger portion of the mammalian genome is transcribed than can be accounted for by currently annotated genes, but the quantity and nature of these additional transcripts remain unclear. Here, we have used data from single- and p...

Bibliographic Details
Main authors: Harm van Bakel, Corey Nislow, Benjamin J Blencowe, Timothy R Hughes
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Biology (General)
Online access: https://doaj.org/article/d0c7d555a14b49eebd43e1dbd44e1ef2
id oai:doaj.org-article:d0c7d555a14b49eebd43e1dbd44e1ef2
record_format dspace
spelling oai:doaj.org-article:d0c7d555a14b49eebd43e1dbd44e1ef2
2021-12-02T19:54:51Z
Most "dark matter" transcripts are associated with known genes.
1544-9173
1545-7885
10.1371/journal.pbio.1000371
https://doaj.org/article/d0c7d555a14b49eebd43e1dbd44e1ef2
2010-05-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20502517/pdf/?tool=EBI
https://doaj.org/toc/1544-9173
https://doaj.org/toc/1545-7885
PLoS Biology, Vol 8, Iss 5, p e1000371 (2010)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description A series of reports over the last few years has indicated that a much larger portion of the mammalian genome is transcribed than can be accounted for by currently annotated genes, but the quantity and nature of these additional transcripts remain unclear. Here, we have used data from single- and paired-end RNA-Seq and tiling arrays to assess the quantity and composition of transcripts in PolyA+ RNA from human and mouse tissues. Relative to tiling arrays, RNA-Seq identifies many fewer transcribed regions ("seqfrags") outside known exons and ncRNAs. Most nonexonic seqfrags are in introns, raising the possibility that they are fragments of pre-mRNAs. The chromosomal locations of the majority of intergenic seqfrags in RNA-Seq data are near known genes, consistent with alternative cleavage and polyadenylation site usage, promoter- and terminator-associated transcripts, or new alternative exons; indeed, reads that bridge splice sites identified 4,544 new exons, affecting 3,554 genes. Most of the remaining seqfrags correspond to either single reads that display characteristics of random sampling from a low-level background or several thousand small transcripts (median length = 111 bp) present at higher levels, which also tend to display sequence conservation and originate from regions with open chromatin. We conclude that, while there are bona fide new intergenic transcripts, their number and abundance are generally low in comparison to known exons, and the genome is not as pervasively transcribed as previously reported.
format article
author Harm van Bakel
Corey Nislow
Benjamin J Blencowe
Timothy R Hughes
title Most "dark matter" transcripts are associated with known genes.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/d0c7d555a14b49eebd43e1dbd44e1ef2