A statistical model for describing and simulating microbial community profiles.
Many methods have been developed for statistical analysis of microbial community profiles, but due to the complex nature of typical microbiome measurements (e.g. sparsity, zero-inflation, non-independence, and compositionality) and of the associated underlying biology, it is difficult to compare or...
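Main authors: | Siyuan Ma, Boyu Ren, Himel Mallick, Yo Sup Moon, Emma Schwager, Sagun Maharjan, Timothy L Tickle, Yiren Lu, Rachel N Carmody, Eric A Franzosa, Lucas Janson, Curtis Huttenhower |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Biology (General), QH301-705.5 |
Online access: | https://doaj.org/article/b5a61e321c084730884928cb547eac54 |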
id |
oai:doaj.org-article:b5a61e321c084730884928cb547eac54 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:b5a61e321c084730884928cb547eac54; 2021-12-02T19:57:49Z; A statistical model for describing and simulating microbial community profiles.; 1553-734X; 1553-7358; 10.1371/journal.pcbi.1008913; https://doaj.org/article/b5a61e321c084730884928cb547eac54; 2021-09-01T00:00:00Z; https://doi.org/10.1371/journal.pcbi.1008913; https://doaj.org/toc/1553-734X; https://doaj.org/toc/1553-7358; Siyuan Ma; Boyu Ren; Himel Mallick; Yo Sup Moon; Emma Schwager; Sagun Maharjan; Timothy L Tickle; Yiren Lu; Rachel N Carmody; Eric A Franzosa; Lucas Janson; Curtis Huttenhower; Public Library of Science (PLoS); article; Biology (General); QH301-705.5; EN; PLoS Computational Biology, Vol 17, Iss 9, p e1008913 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Biology (General) QH301-705.5 |
description |
Many methods have been developed for statistical analysis of microbial community profiles, but due to the complex nature of typical microbiome measurements (e.g. sparsity, zero-inflation, non-independence, and compositionality) and of the associated underlying biology, it is difficult to compare or evaluate such methods within a single systematic framework. To address this challenge, we developed SparseDOSSA (Sparse Data Observations for the Simulation of Synthetic Abundances): a statistical model of microbial ecological population structure, which can be used to parameterize real-world microbial community profiles and to simulate new, realistic profiles of known structure for methods evaluation. Specifically, SparseDOSSA's model captures marginal microbial feature abundances as a zero-inflated log-normal distribution, with additional model components for absolute cell counts and the sequence read generation process, microbe-microbe, and microbe-environment interactions. Together, these allow fully known covariance structure between synthetic features (i.e. "taxa") or between features and "phenotypes" to be simulated for method benchmarking. Here, we demonstrate SparseDOSSA's performance for 1) accurately modeling human-associated microbial population profiles; 2) generating synthetic communities with controlled population and ecological structures; 3) spiking-in true positive synthetic associations to benchmark analysis methods; and 4) recapitulating an end-to-end mouse microbiome feeding experiment. Together, these represent the most common analysis types in assessment of real microbial community environmental and epidemiological statistics, thus demonstrating SparseDOSSA's utility as a general-purpose aid for modeling communities and evaluating quantitative methods. An open-source implementation is available at http://huttenhower.sph.harvard.edu/sparsedossa2. |
format |
article |
author |
Siyuan Ma, Boyu Ren, Himel Mallick, Yo Sup Moon, Emma Schwager, Sagun Maharjan, Timothy L Tickle, Yiren Lu, Rachel N Carmody, Eric A Franzosa, Lucas Janson, Curtis Huttenhower |
author_sort |
Siyuan Ma |
title |
A statistical model for describing and simulating microbial community profiles. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/b5a61e321c084730884928cb547eac54 |
_version_ |
1718375797854568448 |
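
The description above says that marginal feature abundances are modeled as a zero-inflated log-normal distribution, with a further component for sequence read generation. A minimal sketch of that idea in Python follows. It is not the SparseDOSSA implementation or its API: the function name `simulate_profiles`, the parameter names `pi0`, `mu`, `sigma`, and `read_depth`, the use of a multinomial draw for reads, and the handling of empty samples are all assumptions made for illustration. Features are drawn independently, so microbe-microbe and microbe-environment interactions are left out.

```python
import numpy as np


def simulate_profiles(n_samples, pi0, mu, sigma, read_depth, seed=0):
    """Draw synthetic read counts from a zero-inflated log-normal sketch.

    pi0, mu, sigma are per-feature sequences (hypothetical names):
    pi0 is the probability that a feature is absent from a sample,
    mu and sigma are the mean and sd of its log abundance when present.
    """
    rng = np.random.default_rng(seed)
    pi0 = np.asarray(pi0, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n_features = mu.shape[0]

    # Zero inflation: a feature is absent with probability pi0.
    present = rng.random((n_samples, n_features)) >= pi0

    # Non-zero abundances are log-normal, independent across features.
    abundance = np.exp(rng.normal(mu, sigma, size=(n_samples, n_features)))
    abundance = np.where(present, abundance, 0.0)

    # Assumed simplification: a sample with no features present keeps
    # the feature that is least likely to be absent.
    empty = abundance.sum(axis=1) == 0
    if empty.any():
        j = int(np.argmin(pi0))
        abundance[empty, j] = np.exp(
            rng.normal(mu[j], sigma[j], size=int(empty.sum()))
        )

    # Normalize to relative abundances (rows sum to 1).
    relative = abundance / abundance.sum(axis=1, keepdims=True)

    # Read generation, assumed here to be multinomial at a fixed depth.
    return np.array([rng.multinomial(read_depth, p) for p in relative])


if __name__ == "__main__":
    counts = simulate_profiles(
        n_samples=4,
        pi0=[0.1, 0.5, 0.9],
        mu=[0.0, 1.0, 2.0],
        sigma=[1.0, 1.0, 1.0],
        read_depth=1000,
    )
    print(counts)
```

Each row of the returned array is one synthetic sample whose counts sum to `read_depth`. Because every parameter is set by the caller, the structure of the simulated data is fully known, which is the property the description relies on for benchmarking.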