Functional biogeography of ocean microbes revealed through non-negative matrix factorization.

The direct "metagenomic" sequencing of genomic material from complex assemblages of bacteria, archaea, viruses and microeukaryotes has yielded new insights into the structure of microbial communities. For example, analysis of metagenomic data has revealed the existence of previously unknow...

Bibliographic Details
Main authors: Xingpeng Jiang, Morgan G I Langille, Russell Y Neches, Marie Elliot, Simon A Levin, Jonathan A Eisen, Joshua S Weitz, Jonathan Dushoff
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/aea925ab9c7548a8b5981877f9504bec
id oai:doaj.org-article:aea925ab9c7548a8b5981877f9504bec
record_format dspace
spelling oai:doaj.org-article:aea925ab9c7548a8b5981877f9504bec
2021-11-18T08:14:17Z
Functional biogeography of ocean microbes revealed through non-negative matrix factorization.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0043866
https://doaj.org/article/aea925ab9c7548a8b5981877f9504bec
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23049741/?tool=EBI
https://doaj.org/toc/1932-6203
Xingpeng Jiang; Morgan G I Langille; Russell Y Neches; Marie Elliot; Simon A Levin; Jonathan A Eisen; Joshua S Weitz; Jonathan Dushoff
Public Library of Science (PLoS)
article
Medicine (R); Science (Q)
EN
PLoS ONE, Vol 7, Iss 9, p e43866 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description The direct "metagenomic" sequencing of genomic material from complex assemblages of bacteria, archaea, viruses and microeukaryotes has yielded new insights into the structure of microbial communities. For example, analysis of metagenomic data has revealed the existence of previously unknown microbial taxa whose spatial distributions are limited by environmental conditions, ecological competition, and dispersal mechanisms. However, differences in genotypes that might lead biologists to designate two microbes as taxonomically distinct need not necessarily imply differences in ecological function. Hence, there is a growing need for large-scale analysis of the distribution of microbial function across habitats. Here, we present a framework for investigating the biogeography of microbial function by analyzing the distribution of protein families inferred from environmental sequence data across a global collection of sites. We map over 6,000,000 protein sequences from unassembled reads from the Global Ocean Survey dataset to [Formula: see text] protein families, generating a protein family relative abundance matrix that describes the distribution of each protein family across sites. We then use non-negative matrix factorization (NMF) to approximate these protein family profiles as linear combinations of a small number of ecological components. Each component has a characteristic functional profile and site profile. Our approach identifies common functional signatures within several of the components. We use our method as a filter to estimate functional distance between sites, and find that an NMF-filtered measure of functional distance is more strongly correlated with environmental distance than a comparable PCA-filtered measure. We also find that functional distance is more strongly correlated with environmental distance than with geographic distance, in agreement with prior studies. 
We identify similar protein functions in several components and suggest that functional co-occurrence across metagenomic samples could lead to future methods for de-novo functional prediction. We conclude by discussing how NMF, and other dimension reduction methods, can help enable a macroscopic functional description of marine ecosystems.
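The factorization step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the function name `nmf` and the variables are hypothetical, the multiplicative-update rule is one common way to compute an NMF, and Euclidean distance stands in for whatever functional distance the authors actually used. The record does not specify any of these details.

```python
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (families x sites) as W @ H
    using multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # functional profile of each component
    H = rng.random((k, m)) + eps   # site profile of each component
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a protein family abundance table (assumption):
# 40 protein families (rows) observed at 12 sites (columns).
rng = np.random.default_rng(1)
counts = rng.poisson(5.0, size=(40, 12)).astype(float)
V = counts / counts.sum(axis=0, keepdims=True)   # relative abundance per site

W, H = nmf(V, k=3)        # k = 3 components is an arbitrary choice
V_filtered = W @ H        # low-rank, "NMF-filtered" version of V

# Pairwise Euclidean distance between the filtered site profiles (columns).
sites = V_filtered.T
diff = sites[:, None, :] - sites[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))
```

In this sketch each column of `W` plays the role of a component's functional profile and each row of `H` its site profile; `D` could then be compared against environmental or geographic distances between the same sites.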
format article
author Xingpeng Jiang
Morgan G I Langille
Russell Y Neches
Marie Elliot
Simon A Levin
Jonathan A Eisen
Joshua S Weitz
Jonathan Dushoff
author_sort Xingpeng Jiang
title Functional biogeography of ocean microbes revealed through non-negative matrix factorization.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/aea925ab9c7548a8b5981877f9504bec