The impact of modelling rate heterogeneity among sites on phylogenetic estimates of intraspecific evolutionary rates and timescales.

Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often model...

Bibliographic Details
Main authors: Fangzhi Jia, Nathan Lo, Simon Y W Ho
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online access: https://doaj.org/article/afcb0fdb820a47e1a1c30e232f39895a
id oai:doaj.org-article:afcb0fdb820a47e1a1c30e232f39895a
record_format dspace
spelling oai:doaj.org-article:afcb0fdb820a47e1a1c30e232f39895a
2021-11-18T08:20:44Z
1932-6203
10.1371/journal.pone.0095722
https://doaj.org/article/afcb0fdb820a47e1a1c30e232f39895a
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24798481/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 9, Iss 5, p e95722 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Phylogenetic analyses of DNA sequence data can provide estimates of evolutionary rates and timescales. Nearly all phylogenetic methods rely on accurate models of nucleotide substitution. A key feature of molecular evolution is the heterogeneity of substitution rates among sites, which is often modelled using a discrete gamma distribution. A widely used derivative of this is the gamma-invariable mixture model, which assumes that a proportion of sites in the sequence are completely resistant to change, while substitution rates at the remaining sites are gamma-distributed. For data sampled at the intraspecific level, however, biological assumptions involved in the invariable-sites model are commonly violated. We examined the use of these models in analyses of five intraspecific data sets. We show that using 6-10 rate categories for the discrete gamma distribution of rates among sites is sufficient to provide a good approximation of the marginal likelihood. Increasing the number of gamma rate categories did not have a substantial effect on estimates of the substitution rate or coalescence time, unless rates varied strongly among sites in a non-gamma-distributed manner. The assumption of a proportion of invariable sites provided a better approximation of the asymptotic marginal likelihood when the number of gamma categories was small, but had minimal impact on estimates of rates and coalescence times. However, the estimated proportion of invariable sites was highly susceptible to changes in the number of gamma rate categories. The concurrent use of gamma and invariable-site models for intraspecific data is not biologically meaningful and has been challenged on statistical grounds; here we have found that the assumption of a proportion of invariable sites has no obvious impact on Bayesian estimates of rates and timescales from intraspecific data.
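The discrete gamma and gamma-invariable models named in the description can be illustrated with a minimal sketch. It is not the authors' code or any phylogenetics package's API. It assumes the common mean-rate discretisation with equal-probability categories and a gamma distribution of mean 1, and it assumes that the gamma rates are rescaled by 1 / (1 - p_inv) so the overall mean rate stays at 1. The function names, `alpha`, `k` and `p_inv` are hypothetical and chosen for illustration.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    # Series: sum over n of x^n / (a (a+1) ... (a+n)); terms shrink once n > x.
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_quantile(p, alpha):
    """Quantile of Gamma(shape=alpha, rate=alpha), which has mean 1 (bisection)."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(alpha, alpha * hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(alpha, alpha * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean rate within each of k equal-probability gamma categories."""
    cuts = [0.0] + [gamma_quantile(i / k, alpha) for i in range(1, k)] + [math.inf]
    rates = []
    for lo, hi in zip(cuts, cuts[1:]):
        # Integral of r * density over [lo, hi] equals
        # P(alpha + 1, alpha * hi) - P(alpha + 1, alpha * lo).
        upper = 1.0 if math.isinf(hi) else reg_lower_gamma(alpha + 1.0, alpha * hi)
        lower = reg_lower_gamma(alpha + 1.0, alpha * lo)
        rates.append(k * (upper - lower))
    return rates


def gamma_invariable_mixture(alpha, k, p_inv):
    """(weight, rate) pairs: one invariable class plus k rescaled gamma classes."""
    # Assumption: variable-site rates are rescaled so the overall mean rate is 1.
    scale = 1.0 / (1.0 - p_inv)
    mixture = [(p_inv, 0.0)]
    mixture += [((1.0 - p_inv) / k, r * scale) for r in discrete_gamma_rates(alpha, k)]
    return mixture


if __name__ == "__main__":
    for k in (4, 6, 10):
        print(k, [round(r, 4) for r in discrete_gamma_rates(0.5, k)])
    print(gamma_invariable_mixture(0.5, 4, 0.2))
```

Under these assumptions, raising `k` only refines the piecewise approximation of the same gamma distribution. Adding `p_inv` introduces a rate-zero class that overlaps with the lowest gamma categories when `alpha` is small, which is one way to picture why the estimated proportion of invariable sites could be sensitive to the number of categories.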
format article
author Fangzhi Jia
Nathan Lo
Simon Y W Ho
title The impact of modelling rate heterogeneity among sites on phylogenetic estimates of intraspecific evolutionary rates and timescales.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/afcb0fdb820a47e1a1c30e232f39895a