Phylogenomic Identification of Regulatory Sequences in Bacteria: an Analysis of Statistical Power and an Application to Borrelia burgdorferi Sensu Lato

ABSTRACT Phylogenomic footprinting is an approach for ab initio identification of genome-wide regulatory elements in bacterial species based on sequence conservation. The statistical power of the phylogenomic approach depends on the degree of sequence conservation, the length of regulatory elements,...

Bibliographic Details
Main Authors: Che I. Martin, Tika Y. Sukarna, Saymon Akther, Girish Ramrattan, Pedro Pagan, Lia Di, Emmanuel F. Mongodin, Claire M. Fraser, Steven E. Schutzer, Benjamin J. Luft, Sherwood R. Casjens, Wei-Gang Qiu
Format: article
Language: EN
Published: American Society for Microbiology 2015
Subjects: Microbiology, QR1-502
Online Access: https://doaj.org/article/3c0992910e414960a014d8bd104cad13
id oai:doaj.org-article:3c0992910e414960a014d8bd104cad13
record_format dspace
spelling oai:doaj.org-article:3c0992910e414960a014d8bd104cad13
2021-11-15T15:41:33Z
10.1128/mBio.00011-15
2150-7511
https://doaj.org/article/3c0992910e414960a014d8bd104cad13
2015-05-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mBio.00011-15
https://doaj.org/toc/2150-7511
mBio, Vol 6, Iss 2 (2015)
institution DOAJ
collection DOAJ
language EN
topic Microbiology
QR1-502
description ABSTRACT Phylogenomic footprinting is an approach for ab initio identification of genome-wide regulatory elements in bacterial species based on sequence conservation. The statistical power of the phylogenomic approach depends on the degree of sequence conservation, the length of regulatory elements, and the level of phylogenetic divergence among genomes. Building on an earlier model, we propose a binomial model that uses synonymous tree lengths as neutral expectations for determining the statistical significance of conserved intergenic spacer (IGS) sequences. Simulations show that the binomial model is robust to variations in the value of evolutionary parameters, including base frequencies and the transition-to-transversion ratio. We used the model to search for regulatory sequences in the Lyme disease species group (Borrelia burgdorferi sensu lato) using 23 genomes. The model indicates that the currently available set of Borrelia genomes would not yield regulatory sequences shorter than five bases, suggesting that genome sequences of additional B. burgdorferi sensu lato species are needed. Nevertheless, we show that previously known regulatory elements are indeed strongly conserved in sequence or structure across these Borrelia species. Further, we predict with sufficient confidence two new RpoS binding sites, 39 promoters, 19 transcription terminators, 28 noncoding RNAs, and four sets of coregulated genes. These putative cis- and trans-regulatory elements suggest novel, Borrelia-specific mechanisms regulating the transition between the tick and host environments, a key adaptation and virulence mechanism of B. burgdorferi. Alignments of IGS sequences are available on BorreliaBase.org, an online database of orthologous open reading frame (ORF) and IGS sequences in Borrelia.

IMPORTANCE While bacterial genomes contain mostly protein-coding genes, they also house DNA sequences regulating the expression of these genes. Gene regulatory sequences tend to be conserved during evolution. By sequencing and comparing related genomes, one can therefore identify regulatory sequences in bacteria based on sequence conservation. Here, we describe a statistical framework by which one may determine how many genomes need to be sequenced and at what level of evolutionary relatedness in order to achieve a high level of statistical significance. We applied the framework to Borrelia burgdorferi, the Lyme disease agent, and identified a large number of candidate regulatory sequences, many of which are known to be involved in regulating the phase transition between the tick vector and mammalian hosts.
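
Illustrative note (not part of the source record): the abstract's link between tree length, element length, and statistical significance can be sketched with a toy binomial calculation. This is not the authors' model. It assumes, for illustration only, that neutral substitutions at a site follow a Poisson process whose mean is the synonymous tree length, so a neutral site is unchanged across all genomes with probability exp(-T). The function names, the example tree lengths, and the 0.05 threshold are all hypothetical.

```python
import math


def p_site_conserved(tree_length: float) -> float:
    """Probability that a neutral site shows no substitution on the whole tree.

    Assumption (not from the source): substitutions are Poisson with mean
    equal to the tree length, in expected substitutions per site.
    """
    return math.exp(-tree_length)


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): at least k of n sites conserved."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def min_significant_length(tree_length: float, alpha: float = 0.05,
                           max_len: int = 200):
    """Smallest window length whose full conservation is unlikely by chance.

    Returns the smallest L such that the neutral probability of all L sites
    being conserved is below alpha, or None if no L <= max_len qualifies.
    """
    p = p_site_conserved(tree_length)
    for length in range(1, max_len + 1):
        if binomial_tail(length, length, p) < alpha:
            return length
    return None


if __name__ == "__main__":
    # Hypothetical tree lengths; the record does not report the actual values.
    for t in (0.5, 1.0, 2.0):
        print(t, min_significant_length(t))
```

Under these assumptions, a longer tree (more or more divergent genomes) lowers the shortest fully conserved window that reaches significance, which is the qualitative relationship the abstract describes when it calls for additional genomes.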
format article
author Che I. Martin
Tika Y. Sukarna
Saymon Akther
Girish Ramrattan
Pedro Pagan
Lia Di
Emmanuel F. Mongodin
Claire M. Fraser
Steven E. Schutzer
Benjamin J. Luft
Sherwood R. Casjens
Wei-Gang Qiu
title Phylogenomic Identification of Regulatory Sequences in Bacteria: an Analysis of Statistical Power and an Application to Borrelia burgdorferi Sensu Lato
publisher American Society for Microbiology
publishDate 2015
url https://doaj.org/article/3c0992910e414960a014d8bd104cad13