SSUnique: Detecting Sequence Novelty in Microbiome Surveys

ABSTRACT High-throughput sequencing of small-subunit (SSU) rRNA genes has revolutionized understanding of microbial communities and facilitated investigations into ecological dynamics at unprecedented scales. Such extensive SSU rRNA gene sequence libraries, constructed from DNA extracts of environme...

Bibliographic Details
Main authors: Michael D. J. Lynch, Josh D. Neufeld
Format: article
Language: EN
Published: American Society for Microbiology 2016
Subjects:
Online access: https://doaj.org/article/ec8357535c4b40f2b4f0b8af56646b1b
id oai:doaj.org-article:ec8357535c4b40f2b4f0b8af56646b1b
record_format dspace
spelling oai:doaj.org-article:ec8357535c4b40f2b4f0b8af56646b1b
2021-12-02T19:48:49Z
SSUnique: Detecting Sequence Novelty in Microbiome Surveys
10.1128/mSystems.00133-16
2379-5077
https://doaj.org/article/ec8357535c4b40f2b4f0b8af56646b1b
2016-12-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mSystems.00133-16
https://doaj.org/toc/2379-5077
mSystems, Vol 1, Iss 6 (2016)
institution DOAJ
collection DOAJ
language EN
topic 16S rRNA
high-throughput sequencing
microbial dark matter
microbiome
rare biosphere
taxonomic blind spots
Microbiology
QR1-502
description ABSTRACT High-throughput sequencing of small-subunit (SSU) rRNA genes has revolutionized understanding of microbial communities and facilitated investigations into ecological dynamics at unprecedented scales. Such extensive SSU rRNA gene sequence libraries, constructed from DNA extracts of environmental or host-associated samples, often contain a substantial proportion of unclassified sequences, many representing organisms with novel taxonomy (taxonomic “blind spots”) and potentially unique ecology. Indeed, these novel taxonomic lineages are associated with so-called microbial “dark matter,” which is the genomic potential of these lineages. Unfortunately, characterization beyond “unclassified” is challenging due to relatively short read lengths and large data set sizes. Here we demonstrate how mining of phylogenetically novel sequences from microbial ecosystems can be automated using SSUnique, a software pipeline that filters unclassified and/or rare operational taxonomic units (OTUs) from 16S rRNA gene sequence libraries by screening against consensus structural models for SSU rRNA. Phylogenetic position is inferred against a reference data set, and additional characterization of novel clades is also included, such as targeted probe/primer design and mining of assembled metagenomes for genomic context. We show how SSUnique reproduced a previous analysis of phylogenetic novelty from an Arctic tundra soil and demonstrate the recovery of highly novel clades from data sets associated with both the Earth Microbiome Project (EMP) and Human Microbiome Project (HMP). We anticipate that SSUnique will add to the expanding computational toolbox supporting high-throughput sequencing approaches for the study of microbial ecology and phylogeny. 
IMPORTANCE Extensive SSU rRNA gene sequence libraries, constructed from DNA extracts of environmental or host-associated samples, often contain many unclassified sequences, many representing organisms with novel taxonomy (taxonomic “blind spots”) and potentially unique ecology. This novelty is poorly explored in standard workflows, which narrows the breadth and discovery potential of such studies. Here we present the SSUnique analysis pipeline, which will promote the exploration of unclassified diversity in microbiome research and, importantly, enable the discovery of substantial novel taxonomic lineages through the analysis of a large variety of existing data sets.
format article
author Michael D. J. Lynch
Josh D. Neufeld
title SSUnique: Detecting Sequence Novelty in Microbiome Surveys
publisher American Society for Microbiology
publishDate 2016
url https://doaj.org/article/ec8357535c4b40f2b4f0b8af56646b1b