To Dereplicate or Not To Dereplicate?
ABSTRACT Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on...
Main authors: | Jacob T. Evans, Vincent J. Denef |
---|---|
Format: | article |
Language: | EN |
Published: | American Society for Microbiology, 2020 |
Subjects: | MAG; binning; dereplication; metagenomics; population genomics; software; Microbiology; QR1-502 |
Online access: | https://doaj.org/article/59970f89bc9b44a7abe2121e3000f68c |
id |
oai:doaj.org-article:59970f89bc9b44a7abe2121e3000f68c |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:59970f89bc9b44a7abe2121e3000f68c
2021-11-15T15:30:14Z
To Dereplicate or Not To Dereplicate?
10.1128/mSphere.00971-19
2379-5042
https://doaj.org/article/59970f89bc9b44a7abe2121e3000f68c
2020-06-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mSphere.00971-19
https://doaj.org/toc/2379-5042
ABSTRACT Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication.
Jacob T. Evans; Vincent J. Denef
American Society for Microbiology
article
MAG; binning; dereplication; metagenomics; population genomics; software; Microbiology; QR1-502
EN
mSphere, Vol 5, Iss 3 (2020) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
MAG; binning; dereplication; metagenomics; population genomics; software; Microbiology; QR1-502 |
spellingShingle |
MAG; binning; dereplication; metagenomics; population genomics; software; Microbiology; QR1-502; Jacob T. Evans; Vincent J. Denef; To Dereplicate or Not To Dereplicate? |
description |
ABSTRACT Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication. |
format |
article |
author |
Jacob T. Evans; Vincent J. Denef |
author_facet |
Jacob T. Evans; Vincent J. Denef |
author_sort |
Jacob T. Evans |
title |
To Dereplicate or Not To Dereplicate? |
title_sort |
to dereplicate or not to dereplicate? |
publisher |
American Society for Microbiology |
publishDate |
2020 |
url |
https://doaj.org/article/59970f89bc9b44a7abe2121e3000f68c |
work_keys_str_mv |
AT jacobtevans todereplicateornottodereplicate AT vincentjdenef todereplicateornottodereplicate |
_version_ |
1718427886574108672 |