MSA: reproducible mutational signature attribution with confidence based on simulations
Abstract. Background: Mutational signatures have proved to be a useful tool for identifying patterns of mutations in genomes, often providing valuable insights about mutagenic processes or normal DNA damage. De novo extraction of signatures is commonly performed using Non-Negative Matrix Factorisation meth...
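Main author: | Sergey Senkin |
---|---|
Format: | article |
Language: | EN |
Published: | BMC, 2021 |
Subjects: | MSA; Mutational signatures; NNLS; Parametric bootstrap; Nextflow; Computer applications to medicine. Medical informatics; Biology (General) |
Online access: | https://doaj.org/article/d101f8dfde724132b54e2266d9c6cf30 |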
id |
oai:doaj.org-article:d101f8dfde724132b54e2266d9c6cf30 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:d101f8dfde724132b54e2266d9c6cf30; 2021-11-07T12:22:16Z; MSA: reproducible mutational signature attribution with confidence based on simulations; 10.1186/s12859-021-04450-8; 1471-2105; https://doaj.org/article/d101f8dfde724132b54e2266d9c6cf30; 2021-11-01T00:00:00Z; https://doi.org/10.1186/s12859-021-04450-8; https://doaj.org/toc/1471-2105; Abstract. Background: Mutational signatures have proved to be a useful tool for identifying patterns of mutations in genomes, often providing valuable insights about mutagenic processes or normal DNA damage. De novo extraction of signatures is commonly performed using Non-Negative Matrix Factorisation methods. However, accurate attribution of these signatures to individual samples is a distinct problem requiring uncertainty estimation, particularly in noisy scenarios or when the acting signatures have similar shapes. Whilst many packages for signature attribution exist, few provide accuracy measures, and most are neither easily reproducible nor scalable in high-performance computing environments. Results: We present Mutational Signature Attribution (MSA), a reproducible pipeline designed to assign signatures of different mutation types on a single-sample basis, using the Non-Negative Least Squares method with optimisation based on configurable simulations. Parametric bootstrap is proposed as a way to measure statistical uncertainties of signature attribution. Supported mutation types include single and doublet base substitutions, indels and structural variants. Results are validated using simulations with reference COSMIC signatures, as well as randomly generated signatures. Conclusions: MSA is a tool for optimised mutational signature attribution based on simulations, providing confidence intervals using parametric bootstrap. It comprises a set of Python scripts unified in a single Nextflow pipeline with containerisation for cross-platform reproducibility and scalability in high-performance computing environments. The tool is publicly available from https://gitlab.com/s.senkin/MSA .; Sergey Senkin; BMC; article; MSA; Mutational signatures; NNLS; Parametric bootstrap; Nextflow; Computer applications to medicine. Medical informatics; R858-859.7; Biology (General); QH301-705.5; EN; BMC Bioinformatics, Vol 22, Iss 1, Pp 1-11 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
MSA; Mutational signatures; NNLS; Parametric bootstrap; Nextflow; Computer applications to medicine. Medical informatics; R858-859.7; Biology (General); QH301-705.5 |
spellingShingle |
MSA; Mutational signatures; NNLS; Parametric bootstrap; Nextflow; Computer applications to medicine. Medical informatics; R858-859.7; Biology (General); QH301-705.5; Sergey Senkin; MSA: reproducible mutational signature attribution with confidence based on simulations |
description |
Abstract. Background: Mutational signatures have proved to be a useful tool for identifying patterns of mutations in genomes, often providing valuable insights about mutagenic processes or normal DNA damage. De novo extraction of signatures is commonly performed using Non-Negative Matrix Factorisation methods. However, accurate attribution of these signatures to individual samples is a distinct problem requiring uncertainty estimation, particularly in noisy scenarios or when the acting signatures have similar shapes. Whilst many packages for signature attribution exist, few provide accuracy measures, and most are neither easily reproducible nor scalable in high-performance computing environments. Results: We present Mutational Signature Attribution (MSA), a reproducible pipeline designed to assign signatures of different mutation types on a single-sample basis, using the Non-Negative Least Squares method with optimisation based on configurable simulations. Parametric bootstrap is proposed as a way to measure statistical uncertainties of signature attribution. Supported mutation types include single and doublet base substitutions, indels and structural variants. Results are validated using simulations with reference COSMIC signatures, as well as randomly generated signatures. Conclusions: MSA is a tool for optimised mutational signature attribution based on simulations, providing confidence intervals using parametric bootstrap. It comprises a set of Python scripts unified in a single Nextflow pipeline with containerisation for cross-platform reproducibility and scalability in high-performance computing environments. The tool is publicly available from https://gitlab.com/s.senkin/MSA . |
format |
article |
author |
Sergey Senkin |
author_facet |
Sergey Senkin |
author_sort |
Sergey Senkin |
title |
MSA: reproducible mutational signature attribution with confidence based on simulations |
title_short |
MSA: reproducible mutational signature attribution with confidence based on simulations |
title_full |
MSA: reproducible mutational signature attribution with confidence based on simulations |
title_fullStr |
MSA: reproducible mutational signature attribution with confidence based on simulations |
title_full_unstemmed |
MSA: reproducible mutational signature attribution with confidence based on simulations |
title_sort |
msa: reproducible mutational signature attribution with confidence based on simulations |
publisher |
BMC |
publishDate |
2021 |
url |
https://doaj.org/article/d101f8dfde724132b54e2266d9c6cf30 |
work_keys_str_mv |
AT sergeysenkin msareproduciblemutationalsignatureattributionwithconfidencebasedonsimulations |
_version_ |
1718443516945760256 |
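
The description field in this record outlines two ideas: attributing known signatures to a single sample by non-negative least squares, and using a parametric bootstrap to put confidence intervals on the attributed exposures. The sketch below illustrates both on a toy example. It is not taken from the MSA code base. The function names, the four-channel toy signatures, the multinomial resampling scheme and the percentile intervals are all assumptions made for illustration, and the solver is a simple multiplicative-update approximation written with the Python standard library rather than the NNLS routine a production pipeline would normally call.

```python
import random


def nnls_multiplicative(signatures, profile, iters=500, eps=1e-12):
    """Approximately minimise ||S x - profile||^2 subject to x >= 0.

    Uses multiplicative updates x_i <- x_i * (S^T b)_i / (S^T S x)_i,
    which keep x non-negative when signatures and profile are non-negative.
    This is a stand-in for a dedicated NNLS solver (assumption).
    """
    k, n = len(signatures), len(profile)
    stb = [sum(s[c] * profile[c] for c in range(n)) for s in signatures]
    sts = [[sum(a[c] * b[c] for c in range(n)) for b in signatures]
           for a in signatures]
    x = [sum(profile) / k] * k  # flat non-negative starting point
    for _ in range(iters):
        x = [x[i] * stb[i] / (sum(sts[i][j] * x[j] for j in range(k)) + eps)
             for i in range(k)]
    return x


def bootstrap_intervals(signatures, profile, n_boot=200, level=0.95, seed=0):
    """Fit exposures, then resample profiles from the fitted model.

    Each replicate draws the same number of mutations from a multinomial
    whose probabilities follow the fitted reconstruction S x (assumed
    scheme), refits it, and percentile intervals are read off per signature.
    """
    rng = random.Random(seed)
    n = len(profile)
    fitted = nnls_multiplicative(signatures, profile)
    expected = [sum(x * s[c] for x, s in zip(fitted, signatures))
                for c in range(n)]
    total = int(round(sum(profile)))
    replicates = []
    for _ in range(n_boot):
        counts = [0] * n
        for c in rng.choices(range(n), weights=expected, k=total):
            counts[c] += 1
        replicates.append(nnls_multiplicative(signatures, counts))
    lo_i = int((1 - level) / 2 * (n_boot - 1))
    hi_i = int((1 + level) / 2 * (n_boot - 1))
    intervals = []
    for i in range(len(signatures)):
        values = sorted(r[i] for r in replicates)
        intervals.append((values[lo_i], values[hi_i]))
    return fitted, intervals


# Hypothetical toy data: two signatures over four mutation channels,
# and a sample built as 300 * SIG_A + 100 * SIG_B.
SIG_A = [0.7, 0.1, 0.1, 0.1]
SIG_B = [0.1, 0.1, 0.1, 0.7]
profile = [220, 40, 40, 100]

exposures, intervals = bootstrap_intervals([SIG_A, SIG_B], profile)
for name, x, (lo, hi) in zip(["SIG_A", "SIG_B"], exposures, intervals):
    print(f"{name}: exposure {x:.1f}, 95% interval [{lo:.1f}, {hi:.1f}]")
```

Resampling from the fitted reconstruction rather than from the raw counts is what makes this bootstrap parametric in the sketch. Wider intervals for a signature indicate that its exposure is poorly constrained, which is the situation the abstract flags for noisy samples or signatures with similar shapes.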