Multiple-Disease Detection and Classification across Cohorts via Microbiome Search

Bibliographic Details
Main authors: Xiaoquan Su, Gongchao Jing, Zheng Sun, Lu Liu, Zhenjiang Xu, Daniel McDonald, Zengbin Wang, Honglei Wang, Antonio Gonzalez, Yufeng Zhang, Shi Huang, Gavin Huttley, Rob Knight, Jian Xu
Format: Article
Language: English
Published: American Society for Microbiology, 2020
Online access: https://doaj.org/article/6392988b2a2f43b491e1fb8de1f441c5
Description
Summary: ABSTRACT Microbiome-based disease classification depends on well-validated disease-specific models or a priori organismal markers. These are lacking for many diseases. Here, we present an alternative, search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares these to databases of samples from patients. Our strategy’s precision, sensitivity, and speed outperform model-based approaches. In addition, it is more robust to platform heterogeneity and to contamination in 16S rRNA gene amplicon data sets. This search-based strategy shows promise as an important first step in microbiome big-data-based diagnosis.

IMPORTANCE Here, we present a search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares them to databases of samples from patients. This approach enables the identification of microbiome states associated with disease even in the presence of different cohorts, multiple sequencing platforms, or significant contamination.
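
The two-step idea in the summary (flag a sample as an outlier against healthy references, then match it against patient references) can be illustrated with a minimal sketch. This is not the authors' implementation: the Bray-Curtis similarity, the `novelty_threshold` value, and the names `bray_curtis_similarity`, `best_match`, and `classify` are all assumptions chosen for illustration, and the record does not specify the similarity measure or cutoff actually used.

```python
from typing import Dict, List

# A profile maps a taxon name to its relative abundance (assumed representation).
Profile = Dict[str, float]


def bray_curtis_similarity(a: Profile, b: Profile) -> float:
    """Return 1 minus the Bray-Curtis dissimilarity of two abundance profiles."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total > 0 else 0.0


def best_match(query: Profile, database: List[Profile]) -> float:
    """Similarity between the query and its closest sample in a database."""
    return max((bray_curtis_similarity(query, ref) for ref in database), default=0.0)


def classify(
    query: Profile,
    healthy_db: List[Profile],
    disease_dbs: Dict[str, List[Profile]],
    novelty_threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> str:
    """Two-step search: outlier detection against healthy samples, then disease matching."""
    # Step 1: novelty is high when no healthy sample resembles the query.
    novelty = 1.0 - best_match(query, healthy_db)
    if novelty <= novelty_threshold:
        return "healthy"
    # Step 2: assign the disease whose database holds the closest match.
    scores = {name: best_match(query, db) for name, db in disease_dbs.items()}
    if not scores:
        return "unclassified"
    return max(scores, key=scores.get)
```

With toy data, a query that resembles a healthy reference is returned as `"healthy"`, while a query that is far from every healthy sample is assigned to whichever disease database contains its nearest neighbor. A search of this kind needs no disease-specific trained model, which is the property the summary emphasizes.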