CREST--classification resources for environmental sequence tags.

Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding...

Bibliographic Details
Main authors: Anders Lanzén, Steffen L Jørgensen, Daniel H Huson, Markus Gorfer, Svenn Helge Grindhaug, Inge Jonassen, Lise Øvreås, Tim Urich
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/23565c021fc04c53ab4a652ba94b4253
id oai:doaj.org-article:23565c021fc04c53ab4a652ba94b4253
record_format dspace
spelling oai:doaj.org-article:23565c021fc04c53ab4a652ba94b4253 | 2021-11-18T08:09:19Z | CREST--classification resources for environmental sequence tags. | 1932-6203 | 10.1371/journal.pone.0049334 | https://doaj.org/article/23565c021fc04c53ab4a652ba94b4253 | 2012-01-01T00:00:00Z | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23145153/?tool=EBI | https://doaj.org/toc/1932-6203 | Public Library of Science (PLoS) | article | Medicine | R | Science | Q | EN | PLoS ONE, Vol 7, Iss 11, p e49334 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
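The classification approach named in the description (alignment hits resolved with a lowest common ancestor step, then limited by rank similarity criteria) can be illustrated with a minimal Python sketch. This is not CREST's actual code: the rank list, the `MIN_IDENTITY` cutoffs, the `score_range` parameter, and the function names are all hypothetical choices made for illustration, and the hit tuples stand in for whatever an alignment tool would report.

```python
from typing import List, Tuple

# Rank order from root to tip; taxonomy paths below follow this order.
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]

# Hypothetical minimum percent identity required to assign at each rank.
# These values are assumptions for illustration, not taken from the record.
MIN_IDENTITY = {
    "domain": 0.0,
    "phylum": 80.0,
    "class": 85.0,
    "order": 90.0,
    "family": 95.0,
    "genus": 97.0,
}

# (taxonomy path, bit score, percent identity) for one alignment hit.
Hit = Tuple[List[str], float, float]


def lowest_common_ancestor(paths: List[List[str]]) -> List[str]:
    """Return the longest prefix shared by all taxonomy paths."""
    if not paths:
        return []
    shared = []
    for names in zip(*paths):
        if len(set(names)) != 1:
            break
        shared.append(names[0])
    return shared


def classify(hits: List[Hit], score_range: float = 0.02) -> List[str]:
    """Assign a taxonomy path from alignment hits (illustrative only)."""
    if not hits:
        return []
    best_score = max(score for _, score, _ in hits)
    # Keep only hits whose bit score is within score_range of the best hit.
    kept = [h for h in hits if h[1] >= best_score * (1.0 - score_range)]
    lca = lowest_common_ancestor([path for path, _, _ in kept])[: len(RANKS)]
    # Move the assignment up the tree until the best identity meets the
    # cutoff for the rank of the deepest remaining node.
    best_identity = max(identity for _, _, identity in kept)
    while lca and best_identity < MIN_IDENTITY[RANKS[len(lca) - 1]]:
        lca.pop()
    return lca


example_hits = [
    (["Bacteria", "Proteobacteria", "Gammaproteobacteria",
      "Enterobacterales", "Enterobacteriaceae", "Escherichia"], 500.0, 96.0),
    (["Bacteria", "Proteobacteria", "Gammaproteobacteria",
      "Enterobacterales", "Enterobacteriaceae", "Salmonella"], 495.0, 95.5),
    (["Bacteria", "Firmicutes", "Bacilli",
      "Bacillales", "Bacillaceae", "Bacillus"], 300.0, 80.0),
]
print(classify(example_hits))
```

In this sketch the third hit is discarded by the score filter, the two remaining hits disagree only at genus level, and the assignment therefore stops at the family they share. A query whose best identity falls below a rank's cutoff is assigned at a higher rank instead, which is one simple way to avoid forcing a sequence from a novel taxon into a known genus.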
format article
author Anders Lanzén
Steffen L Jørgensen
Daniel H Huson
Markus Gorfer
Svenn Helge Grindhaug
Inge Jonassen
Lise Øvreås
Tim Urich
author_sort Anders Lanzén
title CREST--classification resources for environmental sequence tags.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/23565c021fc04c53ab4a652ba94b4253