Accelerated Profile HMM Searches.

Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Her...

Bibliographic Details
Main author: Sean R Eddy
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Subjects: Biology (General)
Online access: https://doaj.org/article/0f4a503f29d547e2b36619867aa54809
id oai:doaj.org-article:0f4a503f29d547e2b36619867aa54809
record_format dspace
spelling oai:doaj.org-article:0f4a503f29d547e2b36619867aa54809
2021-11-18T05:51:51Z
Accelerated Profile HMM Searches.
1553-734X
1553-7358
10.1371/journal.pcbi.1002195
https://doaj.org/article/0f4a503f29d547e2b36619867aa54809
2011-10-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22039361/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Sean R Eddy
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 7, Iss 10, p e1002195 (2011)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the "multiple segment Viterbi" (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call "sparse rescaling". These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches.
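The abstract describes MSV as computing an optimal sum of multiple ungapped local alignment segments. The following is a minimal scalar sketch of that idea only, not HMMER3's implementation: it omits the striped vector-parallel layout, and the function name `multi_segment_score`, the flat `segment_cost` penalty, and the toy two-letter profile are all assumptions made for illustration.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def multi_segment_score(
    profile: List[Dict[str, float]],
    seq: str,
    segment_cost: float = 2.0,
) -> float:
    """Best total score over any number of ungapped local segments.

    profile[k][residue] is a hypothetical match score for profile
    position k. Each segment follows one diagonal (no gaps), segments
    may not overlap in the sequence, and every segment pays a flat
    segment_cost (an assumed stand-in for entry/exit transition costs).
    """
    m = len(profile)
    # prev[k]: best total with the open segment ending at profile
    # position k on the previous residue (index 0 is a sentinel).
    prev = [NEG_INF] * (m + 1)
    best = 0.0  # best total so far; the empty set of segments scores 0
    for residue in seq:
        start = best - segment_cost  # open a new segment at this residue
        cur = [NEG_INF] * (m + 1)
        for k in range(1, m + 1):
            # Extend the diagonal from (previous residue, k-1) or start fresh.
            cur[k] = profile[k - 1][residue] + max(prev[k - 1], start)
        best = max(best, max(cur))
        prev = cur
    return best


# Toy profile with consensus "ABA": +2 for a match, -3 for a mismatch.
toy_profile = [
    {"A": 2.0, "B": -3.0},
    {"A": -3.0, "B": 2.0},
    {"A": 2.0, "B": -3.0},
]

one_segment = multi_segment_score(toy_profile, "ABA")
two_segments = multi_segment_score(toy_profile, "ABABBABA")
print(one_segment, two_segments)
```

In a filter pipeline of the kind the abstract outlines, a score like this would be computed for every target sequence and only high-scoring targets would be passed on to the more expensive Forward/Backward step; the threshold used for that decision is not specified here.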
format article
author Sean R Eddy
title Accelerated Profile HMM Searches.
title_sort accelerated profile hmm searches.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doaj.org/article/0f4a503f29d547e2b36619867aa54809
work_keys_str_mv AT seanreddy acceleratedprofilehmmsearches
_version_ 1718424711272071168