RAxML and FastTree: comparing two methods for large-scale maximum likelihood phylogeny estimation.

Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequenc...

Bibliographic Details
Main authors: Kevin Liu, C Randal Linder, Tandy Warnow
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Subjects:
R
Q
Online access: https://doaj.org/article/8f2d65a6ab3048e9a1d99f907dc77cec
id oai:doaj.org-article:8f2d65a6ab3048e9a1d99f907dc77cec
record_format dspace
spelling oai:doaj.org-article:8f2d65a6ab3048e9a1d99f907dc77cec | 2021-11-18T07:33:49Z | RAxML and FastTree: comparing two methods for large-scale maximum likelihood phylogeny estimation. | 1932-6203 | 10.1371/journal.pone.0027731 | https://doaj.org/article/8f2d65a6ab3048e9a1d99f907dc77cec | 2011-01-01T00:00:00Z | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22132132/pdf/?tool=EBI | https://doaj.org/toc/1932-6203 | Kevin Liu | C Randal Linder | Tandy Warnow | Public Library of Science (PLoS) | article | Medicine | R | Science | Q | EN | PLoS ONE, Vol 6, Iss 11, p e27731 (2011)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequences. Faster methods for ML estimation, among them FastTree, have also been developed, but their relative performance to RAxML is not yet fully understood. In this study, we explore the performance with respect to ML score, running time, and topological accuracy, of FastTree and RAxML on thousands of alignments (based on both simulated and biological nucleotide datasets) with up to 27,634 sequences. We find that when RAxML and FastTree are constrained to the same running time, FastTree produces topologically much more accurate trees in almost all cases. We also find that when RAxML is allowed to run to completion, it provides an advantage over FastTree in terms of the ML score, but does not produce substantially more accurate tree topologies. Interestingly, the relative accuracy of trees computed using FastTree and RAxML depends in part on the accuracy of the sequence alignment and dataset size, so that FastTree can be more accurate than RAxML on large datasets with relatively inaccurate alignments. Finally, the running times of RAxML and FastTree are dramatically different, so that when run to completion, RAxML can take several orders of magnitude longer than FastTree to complete. Thus, our study shows that very large phylogenies can be estimated very quickly using FastTree, with little (and in some cases no) degradation in tree accuracy, as compared to RAxML.
format article
author Kevin Liu
C Randal Linder
Tandy Warnow
title RAxML and FastTree: comparing two methods for large-scale maximum likelihood phylogeny estimation.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doaj.org/article/8f2d65a6ab3048e9a1d99f907dc77cec