The random nature of genome architecture: predicting open reading frame distributions.

Background: A better understanding of the size and abundance of open reading frames (ORFs) in whole genomes may shed light on the factors that control genome complexity. Here we examine the statistical distributions of open reading frames (i.e. distribution of start and stop codon...

Bibliographic Details
Main authors: Michael W McCoy, Andrew P Allen, James F Gillooly
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2009
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/89956892f6f941b79ffaaa5a980aca04
id oai:doaj.org-article:89956892f6f941b79ffaaa5a980aca04
record_format dspace
spelling oai:doaj.org-article:89956892f6f941b79ffaaa5a980aca04
2021-11-25T06:21:19Z
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0006456
https://doaj.org/article/89956892f6f941b79ffaaa5a980aca04
2009-07-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/19649247/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 4, Iss 7, p e6456 (2009)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Background: A better understanding of the size and abundance of open reading frames (ORFs) in whole genomes may shed light on the factors that control genome complexity. Here we examine the statistical distributions of open reading frames (i.e. distribution of start and stop codons) in the fully sequenced genomes of 297 prokaryotes and 14 eukaryotes.

Methodology/principal findings: By fitting mixture models to data from whole genome sequences we show that the size-frequency distributions for ORFs are strikingly similar across prokaryotic and eukaryotic genomes. Moreover, we show that (i) a large fraction (60-80%) of ORF size-frequency distributions can be predicted a priori with a stochastic assembly model based on GC content, and that (ii) size-frequency distributions of the remaining "non-random" ORFs are well-fitted by log-normal or gamma distributions, and similar to the size distributions of annotated proteins.

Conclusions/significance: Our findings suggest stochastic processes have played a primary role in the evolution of genome complexity, and that common processes govern the conservation and loss of functional genomics units in both prokaryotes and eukaryotes.
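The abstract above mentions a stochastic assembly model based on GC content. The record does not give the authors' formulation, so the following is only a minimal sketch of one simple way such a null model could look. It assumes, purely for illustration, that bases are drawn independently, that G and C each occur with probability GC/2 and A and T each with probability (1 - GC)/2, and that only the three standard stop codons (TAA, TAG, TGA) end an ORF. Start codons are ignored. Under those assumptions the number of sense codons before the first in-frame stop follows a geometric distribution. All function names are hypothetical.

```python
import random

# Standard stop codons; the only signal this toy model knows about.
STOP_CODONS = ("TAA", "TAG", "TGA")


def base_probs(gc):
    """Per-base probabilities, assuming G/C split gc equally and A/T split the rest."""
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def stop_probability(gc):
    """Probability that a random codon is a stop codon under independent bases."""
    p = base_probs(gc)
    return sum(p[c[0]] * p[c[1]] * p[c[2]] for c in STOP_CODONS)


def length_pmf(gc, k):
    """P(exactly k sense codons precede the first stop): geometric in k >= 0."""
    q = stop_probability(gc)
    return (1 - q) ** k * q


def mean_length(gc):
    """Expected number of sense codons before the first stop."""
    q = stop_probability(gc)
    return (1 - q) / q


def simulate_lengths(gc, n, seed=0):
    """Draw n random reading frames codon by codon and record their lengths."""
    rng = random.Random(seed)
    bases, weights = zip(*base_probs(gc).items())
    lengths = []
    for _ in range(n):
        k = 0
        # Keep drawing codons until a stop codon appears.
        while "".join(rng.choices(bases, weights=weights, k=3)) not in STOP_CODONS:
            k += 1
        lengths.append(k)
    return lengths
```

In this sketch a GC content of 0.5 gives a stop probability of 3/64 per codon, and higher GC content lowers the stop probability (all three stop codons are AT-rich), so random ORFs are expected to be longer in GC-rich genomes. A simulated sample from `simulate_lengths` could be compared against `length_pmf` as a quick sanity check of the geometric form.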