phyloFlash: Rapid Small-Subunit rRNA Profiling and Targeted Assembly from Metagenomes
ABSTRACT The small-subunit rRNA (SSU rRNA) gene is the key marker in molecular ecology for all domains of life, but it is largely absent from metagenome-assembled genomes that often are the only resource available for environmental microbes. Here, we present phyloFlash, a pipeline to overcome this g...
Main authors: | Harald R. Gruber-Vodicka, Brandon K. B. Seah, Elmar Pruesse |
---|---|
Format: | article |
Language: | EN |
Published: | American Society for Microbiology, 2020 |
Subjects: | SSU; gene assembly; metagenomics; taxonomic profiling; Microbiology; QR1-502 |
Online access: | https://doaj.org/article/62e65737d0f24c2bbc340a5ce90c8a30 |
id |
oai:doaj.org-article:62e65737d0f24c2bbc340a5ce90c8a30 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:62e65737d0f24c2bbc340a5ce90c8a30; 2021-12-02T19:47:36Z; phyloFlash: Rapid Small-Subunit rRNA Profiling and Targeted Assembly from Metagenomes; 10.1128/mSystems.00920-20; 2379-5077; https://doaj.org/article/62e65737d0f24c2bbc340a5ce90c8a30; 2020-10-01T00:00:00Z; https://journals.asm.org/doi/10.1128/mSystems.00920-20; https://doaj.org/toc/2379-5077; Harald R. Gruber-Vodicka; Brandon K. B. Seah; Elmar Pruesse; American Society for Microbiology; article; SSU; gene assembly; metagenomics; taxonomic profiling; Microbiology; QR1-502; EN; mSystems, Vol 5, Iss 5 (2020) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
SSU; gene assembly; metagenomics; taxonomic profiling; Microbiology; QR1-502 |
description |
ABSTRACT The small-subunit rRNA (SSU rRNA) gene is the key marker in molecular ecology for all domains of life, but it is largely absent from metagenome-assembled genomes that often are the only resource available for environmental microbes. Here, we present phyloFlash, a pipeline to overcome this gap with rapid, SSU rRNA-centered taxonomic classification, targeted assembly, and graph-based binning of full metagenomic assemblies. We show that a cleanup of artifacts is pivotal even with a curated reference database. With such a filtered database, the general-purpose mapper BBMap extracts SSU rRNA reads five times faster than the rRNA-specialized tool SortMeRNA with similar sensitivity and higher selectivity on simulated metagenomes. Reference-based targeted assemblers yielded either highly fragmented assemblies or high levels of chimerism, so we employ the general-purpose genomic assembler SPAdes. Our optimized implementation is independent of reference database composition and has satisfactory levels of chimera formation. phyloFlash quickly processes Illumina (meta)genomic data, is straightforward to use, even as part of high-throughput quality control, and has user-friendly output reports. The software is available at https://github.com/HRGV/phyloFlash (GPL3 license) and is documented with an online manual.
IMPORTANCE To track organisms across all domains of life, the SSU rRNA gene is the gold standard. Many environmental microbes are known only from high-throughput sequence data, but the SSU rRNA gene, the key to visualization by molecular probes and link to existing literature, is often missing from metagenome-assembled genomes (MAGs). The easy-to-use phyloFlash software suite tackles this gap with rapid, SSU rRNA-centered taxonomic classification, targeted assembly, and graph-based linking to MAGs. Starting from a cleaned reference database, phyloFlash profiles the taxonomic diversity and assembles the sorted SSU rRNA reads. The phyloFlash design is domain agnostic and covers eukaryotes, archaea, and bacteria alike. phyloFlash also provides utilities to visualize multisample comparisons and to integrate the recovered SSU rRNAs in a metagenomics workflow by linking them to MAGs using assembly graph parsing. |
format |
article |
author |
Harald R. Gruber-Vodicka; Brandon K. B. Seah; Elmar Pruesse |
author_facet |
Harald R. Gruber-Vodicka; Brandon K. B. Seah; Elmar Pruesse |
author_sort |
Harald R. Gruber-Vodicka |
title |
phyloFlash: Rapid Small-Subunit rRNA Profiling and Targeted Assembly from Metagenomes |
publisher |
American Society for Microbiology |
publishDate |
2020 |
url |
https://doaj.org/article/62e65737d0f24c2bbc340a5ce90c8a30 |