The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activitie...

Bibliographic Details
Main Authors: Adam Alexander Thil Smith, Eugeni Belda, Alain Viari, Claudine Medigue, David Vallenet
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects: Biology (General)
Online Access: https://doaj.org/article/0efaae956be847aaa9eaa1fde03aa1ea
id oai:doaj.org-article:0efaae956be847aaa9eaa1fde03aa1ea
record_format dspace
spelling oai:doaj.org-article:0efaae956be847aaa9eaa1fde03aa1ea
2021-11-18T05:51:17Z
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
1553-734X
1553-7358
10.1371/journal.pcbi.1002540
https://doaj.org/article/0efaae956be847aaa9eaa1fde03aa1ea
2012-05-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22693442/pdf/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Adam Alexander Thil Smith
Eugeni Belda
Alain Viari
Claudine Medigue
David Vallenet
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 8, Iss 5, p e1002540 (2012)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
spellingShingle Biology (General)
QH301-705.5
Adam Alexander Thil Smith
Eugeni Belda
Alain Viari
Claudine Medigue
David Vallenet
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
description Of all biochemically characterized metabolic reactions formalized by the IUBMB, more than one in four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none is both readily accessible and able to integrate results across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding for proteins that catalyze reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful to bioanalysts for visualizing relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations with several scores. In the final step, these scores are used to rank the members of gene families that are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
format article
author Adam Alexander Thil Smith
Eugeni Belda
Alain Viari
Claudine Medigue
David Vallenet
author_facet Adam Alexander Thil Smith
Eugeni Belda
Alain Viari
Claudine Medigue
David Vallenet
author_sort Adam Alexander Thil Smith
title The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
title_short The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
title_full The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
title_fullStr The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
title_full_unstemmed The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
title_sort canoe strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/0efaae956be847aaa9eaa1fde03aa1ea
_version_ 1718424711079133184
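
The four steps summarized in the description field can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy genomes, gene and family identifiers, the `max_gap` co-localization threshold, and the single "number of supporting genomes" score are all hypothetical stand-ins for the data and the several scores the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy inputs; none of these identifiers come from the paper.
REACTION_METABOLITES = {
    "R1": {"A", "B"},
    "R2": {"B", "C"},
    "R3": {"C", "D"},  # treated as the gene-less (orphan) reaction below
}
# genome -> list of (gene, position on the chromosome, annotated reactions)
GENOMES = {
    "g1": [("g1_a", 1, {"R1"}), ("g1_b", 2, {"R2"}),
           ("g1_x", 3, set()), ("g1_far", 50, set())],
    "g2": [("g2_a", 10, {"R1"}), ("g2_x", 11, set()), ("g2_b", 12, {"R2"})],
}
GENE_FAMILY = {"g1_x": "famX", "g2_x": "famX", "g1_far": "famY"}


def share_metabolite(rxns_a, rxns_b):
    """True if any reaction in rxns_a shares a metabolite with one in rxns_b."""
    return any(REACTION_METABOLITES[r] & REACTION_METABOLITES[s]
               for r in rxns_a for s in rxns_b)


def find_metabolons(genes, max_gap=2):
    """Step 1 (assumed form): group annotated genes that are close on the
    chromosome and whose reactions are linked by a shared metabolite."""
    annotated = [g for g in genes if g[2]]
    parent = {g[0]: g[0] for g in annotated}

    def find(x):  # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(annotated, 2):
        if abs(a[1] - b[1]) <= max_gap and share_metabolite(a[2], b[2]):
            parent[find(a[0])] = find(b[0])

    groups = defaultdict(list)
    for g in annotated:
        groups[find(g[0])].append(g)
    return [grp for grp in groups.values() if len(grp) > 1]


def candidate_associations(genes, metabolon, orphan_reactions, max_gap=2):
    """Step 2 (assumed form): pair un-annotated genes near the metabolon with
    gene-less reactions that share a metabolite with the metabolon."""
    positions = [g[1] for g in metabolon]
    lo, hi = min(positions) - max_gap, max(positions) + max_gap
    metabolon_rxns = set().union(*(g[2] for g in metabolon))
    nearby = [g[0] for g in genes if not g[2] and lo <= g[1] <= hi]
    linked = [r for r in sorted(orphan_reactions)
              if share_metabolite({r}, metabolon_rxns)]
    return [(gene, rxn) for gene in nearby for rxn in linked]


def rank_family_reactions(genomes, gene_family, orphan_reactions, max_gap=2):
    """Steps 3 and 4 (assumed form): pool gene-reaction candidates by gene
    family across genomes, then rank by number of supporting genomes."""
    support = defaultdict(set)  # (family, reaction) -> supporting genomes
    for genome, genes in genomes.items():
        for metabolon in find_metabolons(genes, max_gap):
            for gene, rxn in candidate_associations(
                    genes, metabolon, orphan_reactions, max_gap):
                family = gene_family.get(gene)
                if family is not None:
                    support[(family, rxn)].add(genome)
    scored = [(fam, rxn, len(gs)) for (fam, rxn), gs in support.items()]
    return sorted(scored, key=lambda t: (-t[2], t[0], t[1]))


ranking = rank_family_reactions(GENOMES, GENE_FAMILY, {"R3"})
print(ranking)
```

In this toy data, the hypothetical family `famX` is proposed for `R3` because its members sit next to a metabolon in both genomes, whereas the distant gene `g1_far` contributes nothing. A real scoring scheme would presumably weigh more than a simple genome count.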