Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC)...

Bibliographic Details
Main authors: Sebastian Höhna, Michael J. Landis, John P. Huelsenbeck
Format: article
Language: EN
Published: PeerJ Inc. 2021
Subjects:
R
Online access: https://doaj.org/article/f0d5f15a6f444068a4bc767018c0774d
id oai:doaj.org-article:f0d5f15a6f444068a4bc767018c0774d
record_format dspace
spelling oai:doaj.org-article:f0d5f15a6f444068a4bc767018c0774d
Record datestamp: 2021-11-04T15:05:35Z
DOI: 10.7717/peerj.12438
ISSN: 2167-8359
Date: 2021-11-01T00:00:00Z
Full text: https://peerj.com/articles/12438/
PDF: https://peerj.com/articles/12438.pdf
Journal table of contents: https://doaj.org/toc/2167-8359
Source: PeerJ, Vol 9, p e12438 (2021)
institution DOAJ
collection DOAJ
language EN
topic Bayes factor
Parallelization
Phylogenetics
Medicine
R
description In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
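The parallelization idea in the description can be illustrated with a small sketch. It is not the RevBayes implementation. It assumes a hypothetical toy model (one observation `Y` drawn from Normal(theta, 1) with a Normal(0, 1) prior on theta), chosen because each power posterior is then itself a normal distribution that can be sampled directly, so no MCMC is needed. The names `run_stone`, `stepping_stone_log_ml`, and `exact_log_ml`, the cubic spacing of the powers, and the use of a thread pool are all assumptions made for illustration. The point it shows is that each power posterior ("stone") can be simulated independently, so the stones can be handed to separate workers and their log ratios summed afterwards.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

Y = 1.5  # single hypothetical observation (assumption for this toy example)


def log_likelihood(theta):
    # Toy likelihood: Y ~ Normal(theta, 1)
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (Y - theta) ** 2


def run_stone(job):
    """Simulate one power posterior and return the log of its stepping-stone ratio."""
    beta_prev, beta_next, n_samples, seed = job
    rng = random.Random(seed)  # private generator, so results do not depend on scheduling
    # With a Normal(0, 1) prior, the power posterior at beta_prev is
    # Normal(beta_prev * Y / (1 + beta_prev), variance 1 / (1 + beta_prev)).
    precision = 1.0 + beta_prev
    mean = beta_prev * Y / precision
    sd = 1.0 / math.sqrt(precision)
    delta = beta_next - beta_prev
    terms = [delta * log_likelihood(rng.gauss(mean, sd)) for _ in range(n_samples)]
    # Numerically stable log of the sample mean of likelihood**delta
    largest = max(terms)
    return largest + math.log(sum(math.exp(t - largest) for t in terms) / n_samples)


def stepping_stone_log_ml(n_stones=32, n_samples=2000, n_workers=4):
    """Estimate the log marginal likelihood by summing per-stone log ratios."""
    # Powers from 0 (prior) to 1 (posterior), spaced more densely near the prior
    betas = [(k / n_stones) ** 3 for k in range(n_stones + 1)]
    jobs = [(betas[k], betas[k + 1], n_samples, k) for k in range(n_stones)]
    # Each stone is independent of the others, so the jobs can run concurrently.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        log_ratios = list(pool.map(run_stone, jobs))
    return sum(log_ratios)


def exact_log_ml():
    # For this toy model, Y is marginally Normal(0, 2)
    return -0.5 * math.log(2.0 * math.pi * 2.0) - Y ** 2 / 4.0


if __name__ == "__main__":
    print("stepping-stone estimate:", stepping_stone_log_ml())
    print("exact value:            ", exact_log_ml())
```

Threads are used here only to keep the sketch self-contained. For CPU-bound simulations in Python, a process pool or MPI would be needed to obtain an actual speedup, and the description above also mentions distributing the likelihood computations themselves, which this sketch does not attempt.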
format article
author Sebastian Höhna
Michael J. Landis
John P. Huelsenbeck
title Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics
publisher PeerJ Inc.
publishDate 2021
url https://doaj.org/article/f0d5f15a6f444068a4bc767018c0774d