Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities

ABSTRACT The site-to-site variability in species composition, known as β-diversity, is crucial to understanding spatiotemporal patterns of species diversity and the mechanisms controlling community composition and structure. However, quantifying β-diversity in microbial ecology using sequencing-base...

Bibliographic Details
Main authors: Jizhong Zhou, Yi-Huei Jiang, Ye Deng, Zhou Shi, Benjamin Yamin Zhou, Kai Xue, Liyou Wu, Zhili He, Yunfeng Yang
Format: article
Language: EN
Published: American Society for Microbiology 2013
Subjects: Microbiology; QR1-502
Online access: https://doaj.org/article/512a59d285644ddb8a7f0df0443b193b
id oai:doaj.org-article:512a59d285644ddb8a7f0df0443b193b
record_format dspace
spelling oai:doaj.org-article:512a59d285644ddb8a7f0df0443b193b
2021-11-15T15:40:06Z
Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
10.1128/mBio.00324-13
2150-7511
https://doaj.org/article/512a59d285644ddb8a7f0df0443b193b
2013-07-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mBio.00324-13
https://doaj.org/toc/2150-7511
Jizhong Zhou
Yi-Huei Jiang
Ye Deng
Zhou Shi
Benjamin Yamin Zhou
Kai Xue
Liyou Wu
Zhili He
Yunfeng Yang
American Society for Microbiology
article
Microbiology
QR1-502
EN
mBio, Vol 4, Iss 3 (2013)
institution DOAJ
collection DOAJ
language EN
topic Microbiology
QR1-502
spellingShingle Microbiology
QR1-502
Jizhong Zhou
Yi-Huei Jiang
Ye Deng
Zhou Shi
Benjamin Yamin Zhou
Kai Xue
Liyou Wu
Zhili He
Yunfeng Yang
Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
description ABSTRACT The site-to-site variability in species composition, known as β-diversity, is crucial to understanding spatiotemporal patterns of species diversity and the mechanisms controlling community composition and structure. However, quantifying β-diversity in microbial ecology using sequencing-based technologies is a great challenge because of a high number of sequencing errors, bias, and poor reproducibility and quantification. Herein, based on general sampling theory, a mathematical framework was first developed for simulating the effects of random sampling processes on quantifying β-diversity when the community size is known or unknown. Also, using an analogous ball example under Poisson sampling with limited sampling efforts, the developed mathematical framework could exactly predict the low reproducibility among technical replicate samples from the same community with a given species abundance distribution, which provides explicit evidence that random sampling processes are the main factor causing high percentages of technical variation. In addition, the predicted values under Poisson random sampling were highly consistent with the observed low percentages of operational taxonomic unit (OTU) overlap (<30% and <20% for two and three tags, respectively, based on both Jaccard and Bray-Curtis dissimilarity indexes), further supporting the hypothesis that the poor reproducibility among technical replicates is due to artifacts associated with random sampling processes. Finally, a mathematical framework was developed for predicting the sampling effort needed to achieve a desired overlap among replicate samples. Our modeling simulations predict that several orders of magnitude more sequencing effort is needed to achieve the desired high technical reproducibility. These results suggest that great caution needs to be taken in quantifying and interpreting β-diversity for microbial community analysis using next-generation sequencing technologies.
IMPORTANCE Due to the vast diversity and uncultivated status of the majority of microorganisms, microbial detection, characterization, and quantitation pose a great challenge. Although large-scale metagenome sequencing technology such as PCR-based amplicon sequencing has revolutionized the study of microbial communities, it suffers from several inherent drawbacks, such as a high number of sequencing errors, biases, poor quantitation, and very high percentages of technical variation, which could lead to a great overestimation of microbial biodiversity. Based on general sampling theory, this study provided the first explicit evidence to demonstrate the importance of random sampling processes in estimating microbial β-diversity, which has not been adequately recognized and addressed in microbial ecology. Since most ecological studies involve random sampling, the conclusions drawn from this study should also be applicable to other ecological studies in general. In summary, the results presented in this study should have important implications for examining microbial biodiversity to address both basic theoretical and applied management questions.
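The abstract's central claim, that Poisson random sampling alone can produce low OTU overlap among technical replicates, can be illustrated with a small calculation. The sketch below is not the authors' framework. It assumes each OTU's read count in a replicate is Poisson with mean equal to sequencing effort times relative abundance, uses a hypothetical Zipf-like abundance distribution (`toy_community`), and approximates the incidence-based (Jaccard-style) overlap by the ratio of the expected number of OTUs seen in every replicate to the expected number seen in at least one. All function names and parameter values are illustrative assumptions, and abundance-based indexes such as Bray-Curtis are not covered.

```python
import math


def detection_prob(rel_abundance, effort):
    """P(an OTU gets at least one read) if its count is Poisson(effort * rel_abundance)."""
    return 1.0 - math.exp(-effort * rel_abundance)


def expected_overlap(rel_abundances, effort, replicates=2):
    """Expected OTUs detected in all replicates divided by expected OTUs detected in any.

    This ratio of expectations is only an approximation of the expected overlap.
    """
    shared = 0.0
    union = 0.0
    for p in rel_abundances:
        d = detection_prob(p, effort)
        shared += d ** replicates                # detected in every replicate
        union += 1.0 - (1.0 - d) ** replicates   # detected in at least one replicate
    return shared / union if union > 0 else 0.0


def toy_community(n_otus=10000, exponent=1.0):
    """Hypothetical Zipf-like relative abundances, normalized to sum to 1."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_otus + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    community = toy_community()
    for effort in (1_000, 10_000, 100_000, 1_000_000):
        two = expected_overlap(community, effort, 2)
        three = expected_overlap(community, effort, 3)
        print(f"reads={effort:>9}: 2 replicates {two:.2f}, 3 replicates {three:.2f}")
```

Under these assumptions the overlap for three replicates can never exceed that for two, and rare OTUs dominate the loss of overlap at low effort, which is the qualitative pattern the abstract describes.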
format article
author Jizhong Zhou
Yi-Huei Jiang
Ye Deng
Zhou Shi
Benjamin Yamin Zhou
Kai Xue
Liyou Wu
Zhili He
Yunfeng Yang
author_facet Jizhong Zhou
Yi-Huei Jiang
Ye Deng
Zhou Shi
Benjamin Yamin Zhou
Kai Xue
Liyou Wu
Zhili He
Yunfeng Yang
author_sort Jizhong Zhou
title Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
title_short Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
title_full Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
title_fullStr Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
title_full_unstemmed Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities
title_sort random sampling process leads to overestimation of β-diversity of microbial communities
publisher American Society for Microbiology
publishDate 2013
url https://doaj.org/article/512a59d285644ddb8a7f0df0443b193b
work_keys_str_mv AT jizhongzhou randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT yihueijiang randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT yedeng randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT zhoushi randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT benjaminyaminzhou randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT kaixue randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT liyouwu randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT zhilihe randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
AT yunfengyang randomsamplingprocessleadstooverestimationofbdiversityofmicrobialcommunities
_version_ 1718427764999061504