Relative model selection of evolutionary substitution models can be sensitive to multiple sequence alignment uncertainty

Abstract Background Multiple sequence alignments (MSAs) represent the fundamental unit of data inputted to most comparative sequence analyses. In phylogenetic analyses in particular, errors in MSA construction have the potential to induce further errors in downstream analyses such as phylogenetic re...

Bibliographic Details
Main authors: Stephanie J. Spielman, Molly L. Miraglia
Format: article
Language: EN
Published: BMC 2021
Subjects:
Online access: https://doaj.org/article/cd8c9535389743389e8f6802dc36226c
id oai:doaj.org-article:cd8c9535389743389e8f6802dc36226c
record_format dspace
spelling oai:doaj.org-article:cd8c9535389743389e8f6802dc36226c
2021-12-05T12:04:13Z
Relative model selection of evolutionary substitution models can be sensitive to multiple sequence alignment uncertainty
10.1186/s12862-021-01931-5
2730-7182
https://doaj.org/article/cd8c9535389743389e8f6802dc36226c
2021-11-01T00:00:00Z
https://doi.org/10.1186/s12862-021-01931-5
https://doaj.org/toc/2730-7182
Stephanie J. Spielman
Molly L. Miraglia
BMC
article
Multiple sequence alignment
Phylogenetics
Relative model selection
Models of sequence evolution
Ecology
QH540-549.5
Evolution
QH359-425
EN
BMC Ecology and Evolution, Vol 21, Iss 1, Pp 1-11 (2021)
institution DOAJ
collection DOAJ
language EN
topic Multiple sequence alignment
Phylogenetics
Relative model selection
Models of sequence evolution
Ecology
QH540-549.5
Evolution
QH359-425
description Abstract. Background: Multiple sequence alignments (MSAs) represent the fundamental unit of data inputted to most comparative sequence analyses. In phylogenetic analyses in particular, errors in MSA construction have the potential to induce further errors in downstream analyses such as phylogenetic reconstruction itself, ancestral state reconstruction, and divergence time estimation. In addition to providing phylogenetic methods with an MSA to analyze, researchers must also specify a suitable evolutionary model for the given analysis. Most commonly, researchers apply relative model selection to select a model from a candidate set and then provide both the MSA and the selected model as input to subsequent analyses. While the influence of MSA errors has been explored for most stages of phylogenetics pipelines, the potential effects of MSA uncertainty on the relative model selection procedure itself have not been explored. Results: We assessed the consistency of relative model selection when presented with multiple perturbed versions of a given MSA. We find that while relative model selection is mostly robust to MSA uncertainty, in a substantial proportion of circumstances, relative model selection identifies distinct best-fitting models from different MSAs created from the same set of sequences. We find that this issue is more pervasive for nucleotide data compared to amino-acid data. However, we also find that it is challenging to predict whether relative model selection will be robust or sensitive to uncertainty in a given MSA. Conclusions: We find that MSA uncertainty can affect virtually all steps of phylogenetic analysis pipelines to a greater extent than has previously been recognized, including relative model selection.
format article
author Stephanie J. Spielman
Molly L. Miraglia
author_facet Stephanie J. Spielman
Molly L. Miraglia
author_sort Stephanie J. Spielman
title Relative model selection of evolutionary substitution models can be sensitive to multiple sequence alignment uncertainty
publisher BMC
publishDate 2021
url https://doaj.org/article/cd8c9535389743389e8f6802dc36226c
work_keys_str_mv AT stephaniejspielman relativemodelselectionofevolutionarysubstitutionmodelscanbesensitivetomultiplesequencealignmentuncertainty
AT mollylmiraglia relativemodelselectionofevolutionarysubstitutionmodelscanbesensitivetomultiplesequencealignmentuncertainty
_version_ 1718372281784205312