MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation

ABSTRACT Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (...

Bibliographic Details
Main authors: David Koslicki, Daniel Falush
Format: article
Language: EN
Published: American Society for Microbiology 2016
Subjects: taxonomic profiling, metagenomics, quantitative methods, Microbiology, QR1-502
Online access: https://doaj.org/article/e78a8718c7fa438ca1aa8c5cc8c2b0a0
id oai:doaj.org-article:e78a8718c7fa438ca1aa8c5cc8c2b0a0
record_format dspace
spelling oai:doaj.org-article:e78a8718c7fa438ca1aa8c5cc8c2b0a0
2021-12-02T18:39:34Z
MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation
10.1128/mSystems.00020-16
2379-5077
https://doaj.org/article/e78a8718c7fa438ca1aa8c5cc8c2b0a0
2016-06-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mSystems.00020-16
https://doaj.org/toc/2379-5077
David Koslicki
Daniel Falush
American Society for Microbiology
article
taxonomic profiling
metagenomics
quantitative methods
Microbiology
QR1-502
EN
mSystems, Vol 1, Iss 3 (2016)
institution DOAJ
collection DOAJ
language EN
topic taxonomic profiling
metagenomics
quantitative methods
Microbiology
QR1-502
description ABSTRACT Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses. IMPORTANCE Taxonomic profiling is a challenging first step when analyzing a metagenomic sample. This work presents a method that facilitates fine-scale characterization of the presence, abundance, and evolutionary relatedness of organisms present in a given sample but absent from the training database. We calculate a “k-mer palette” which summarizes the information from all reads, not just those in conserved genes or containing taxon-specific markers. The compositions of palettes are easy to model, allowing rapid inference of community composition. In addition to providing strain-level information where applicable, our approach provides taxonomic profiles that are more accurate than those of competing methods. Author Video: An author video summary of this article is available.
format article
author David Koslicki
Daniel Falush
title MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation
publisher American Society for Microbiology
publishDate 2016
url https://doaj.org/article/e78a8718c7fa438ca1aa8c5cc8c2b0a0