HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share su...

Bibliographic Details
Main authors: Charles Richard Bradshaw, Vineeth Surendranath, Robert Henschel, Matthias Stefan Mueller, Bianca Hermine Habermann
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Subjects:
R
Q
Online access: https://doaj.org/article/141019c298e24b82a0f5faed12963047
id oai:doaj.org-article:141019c298e24b82a0f5faed12963047
record_format dspace
spelling oai:doaj.org-article:141019c298e24b82a0f5faed12963047
2021-11-18T06:57:28Z
HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.
1932-6203
10.1371/journal.pone.0017568
https://doaj.org/article/141019c298e24b82a0f5faed12963047
2011-03-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21423752/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Charles Richard Bradshaw
Vineeth Surendranath
Robert Henschel
Matthias Stefan Mueller
Bianca Hermine Habermann
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 6, Iss 3, p e17568 (2011)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
format article
author Charles Richard Bradshaw
Vineeth Surendranath
Robert Henschel
Matthias Stefan Mueller
Bianca Hermine Habermann
title HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doaj.org/article/141019c298e24b82a0f5faed12963047
_version_ 1718424150666641408
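
The description above states that HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications, for domain families with an associated 3-dimensional structure. A minimal Python sketch of that filtering idea follows; it is not the authors' implementation. The names `SeqHit`, `confirm_by_fold`, `domain_to_fold`, and the `max_evalue` cutoff are hypothetical, as are the example protein identifiers and fold labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeqHit:
    protein: str   # query protein identifier (hypothetical)
    domain: str    # conserved-domain family reported by the relaxed search
    evalue: float  # E-value of the sequence-based hit


def confirm_by_fold(seq_hits, fold_hits, domain_to_fold, max_evalue=10.0):
    """Keep relaxed sequence hits whose fold is also found by fold recognition.

    seq_hits: iterable of SeqHit from a relaxed conserved-domain search.
    fold_hits: dict mapping protein -> set of folds predicted for it.
    domain_to_fold: dict mapping domain family -> fold of its known structure.
    max_evalue: assumed relaxed cutoff for illustration, not a value from the paper.
    """
    confirmed = []
    for hit in seq_hits:
        if hit.evalue > max_evalue:
            continue  # too weak even for the relaxed search
        fold = domain_to_fold.get(hit.domain)
        if fold is None:
            continue  # no associated structure, so no fold check is possible
        if fold in fold_hits.get(hit.protein, set()):
            confirmed.append(hit)  # sequence and structure evidence agree
    return confirmed


# Toy data: only the first hit passes both the cutoff and the fold check.
hits = [
    SeqHit("P1", "HLH", 2.5),
    SeqHit("P1", "SH3", 4.0),
    SeqHit("P2", "HLH", 50.0),
]
folds = {"P1": {"helix-loop-helix"}, "P2": {"helix-loop-helix"}}
d2f = {"HLH": "helix-loop-helix", "SH3": "SH3-like barrel"}

print([(h.protein, h.domain) for h in confirm_by_fold(hits, folds, d2f)])
```

In this toy run the weak `HLH` hit on `P1` is kept because the fold prediction agrees with it, the `SH3` hit is dropped because no matching fold is predicted, and the `P2` hit is dropped because it exceeds the assumed cutoff. The cross-species validation mentioned in the description would be a further filter on the confirmed hits and is not sketched here.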