Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae

Abstract Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conserv...

Bibliographic Details
Main Authors: John P. Lloyd, Megan J. Bowman, Christina B. Azodi, Rosalie P. Sowers, Gaurav D. Moghe, Kevin L. Childs, Shin-Han Shiu
Format: article
Language: EN
Published: Nature Portfolio 2019
Subjects:
R
Q
Online Access: https://doaj.org/article/65250de186414f50a6c8138c34bea35e
id oai:doaj.org-article:65250de186414f50a6c8138c34bea35e
record_format dspace
spelling oai:doaj.org-article:65250de186414f50a6c8138c34bea35e
2021-12-02T15:08:45Z
Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae
10.1038/s41598-019-47797-y
2045-2322
https://doaj.org/article/65250de186414f50a6c8138c34bea35e
2019-08-01T00:00:00Z
https://doi.org/10.1038/s41598-019-47797-y
https://doaj.org/toc/2045-2322
John P. Lloyd
Megan J. Bowman
Christina B. Azodi
Rosalie P. Sowers
Gaurav D. Moghe
Kevin L. Childs
Shin-Han Shiu
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 9, Iss 1, Pp 1-14 (2019)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Extensive transcriptional activity occurring in intergenic regions of genomes has raised the question whether intergenic transcription represents the activity of novel genes or noisy expression. To address this, we evaluated cross-species and post-duplication sequence and expression conservation of intergenic transcribed regions (ITRs) in four Poaceae species. Among 43,301 ITRs across the four species, 34,460 (80%) are species-specific. ITRs found across species tend to be more divergent in expression and have more recent duplicates compared to annotated genes. To assess if ITRs are functional (under selection), machine learning models were established in Oryza sativa (rice) that could accurately distinguish between phenotype genes and pseudogenes (area under curve-receiver operating characteristic = 0.94). Based on the models, 584 (8%) and 4391 (61%) rice ITRs are classified as likely functional and nonfunctional with high confidence, respectively. ITRs with conserved expression and ancient retained duplicates, features that were not part of the model, are frequently classified as likely-functional, suggesting these characteristics could serve as pragmatic rules of thumb for identifying candidate sequences likely to be under selection. This study also provides a framework to identify novel genes using comparative transcriptomic data to improve genome annotation that is fundamental for connecting genotype to phenotype in crop and model systems.
format article
author John P. Lloyd
Megan J. Bowman
Christina B. Azodi
Rosalie P. Sowers
Gaurav D. Moghe
Kevin L. Childs
Shin-Han Shiu
title Evolutionary characteristics of intergenic transcribed regions indicate rare novel genes and widespread noisy transcription in the Poaceae
publisher Nature Portfolio
publishDate 2019
url https://doaj.org/article/65250de186414f50a6c8138c34bea35e