An automated phylogenetic tree-based small subunit rRNA taxonomy and alignment pipeline (STAP).

Bibliographic details
Main authors: Dongying Wu, Amber Hartman, Naomi Ward, Jonathan A Eisen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2008
Subjects:
R
Q
Online access: https://doaj.org/article/de1cb5df00944db6856ef291399201d4
Description
Abstract: Comparative analysis of small-subunit ribosomal RNA (ss-rRNA) gene sequences forms the basis for much of what we know about the phylogenetic diversity of both cultured and uncultured microorganisms. As sequencing costs continue to decline and throughput increases, sequences of ss-rRNA genes are being obtained at an ever-increasing rate. This increasing flow of data has opened many new windows into microbial diversity and evolution, and at the same time has created significant methodological challenges. Processes that commonly require time-consuming human intervention, such as the preparation of multiple sequence alignments, simply cannot keep up with the flood of incoming data. Fully automated methods of analysis are needed. Notably, existing automated methods avoid one or more steps that, though computationally costly or difficult, we consider to be important. In particular, we regard both the building of multiple sequence alignments and the performance of high-quality phylogenetic analysis to be necessary. We describe here our fully automated ss-rRNA taxonomy and alignment pipeline (STAP). It generates both high-quality multiple sequence alignments and phylogenetic trees, and thus can be used for multiple purposes, including phylogenetically based taxonomic assignments and analysis of species diversity in environmental samples. The pipeline combines publicly available packages (PHYML, BLASTN and CLUSTALW) with our automatic alignment, masking, and tree-parsing programs. Most importantly, this automated process yields results comparable to those achievable by manual analysis, yet offers speed and capacity that are unattainable by manual efforts.
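
The abstract mentions an alignment masking step but does not say how columns are chosen. As a rough illustration only, the sketch below shows one common way such a step can work: dropping alignment columns in which too many sequences have a gap. The function name mask_alignment, the max_gap_fraction parameter, and the 0.5 threshold are all hypothetical and are not taken from STAP.

```python
from typing import Dict


def mask_alignment(alignment: Dict[str, str],
                   max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Drop columns whose gap fraction exceeds max_gap_fraction.

    Hypothetical masking rule for illustration; STAP's actual
    masking criteria are not described in the abstract.
    """
    if not alignment:
        return {}
    rows = list(alignment.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column only if its share of gap characters is small enough.
    keep = [
        col for col in range(length)
        if sum(row[col] in "-." for row in rows) / len(rows) <= max_gap_fraction
    ]
    return {name: "".join(seq[c] for c in keep)
            for name, seq in alignment.items()}


# Toy alignment: the fourth column is a gap in two of three sequences.
example = {
    "seq_a": "ACG-TA",
    "seq_b": "ACG-TA",
    "seq_c": "AC-GTA",
}
masked = mask_alignment(example)
```

In this toy input only the fourth column is removed, because two of its three characters are gaps. A masked alignment of this kind is the sort of input a tree-building program such as PHYML would then receive.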