Evaluation of serverless computing for scalable execution of a joint variant calling workflow.
Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been wi...
Main Authors: | Aji John, Kathleen Muenzen, Kristiina Ausmees |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Medicine; Science |
Online Access: | https://doaj.org/article/2b72b067a19a473b82e50d3191c4398e |
id |
oai:doaj.org-article:2b72b067a19a473b82e50d3191c4398e |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:2b72b067a19a473b82e50d3191c4398e; 2021-12-02T20:15:31Z; Evaluation of serverless computing for scalable execution of a joint variant calling workflow.; 1932-6203; 10.1371/journal.pone.0254363; https://doaj.org/article/2b72b067a19a473b82e50d3191c4398e; 2021-01-01T00:00:00Z; https://doi.org/10.1371/journal.pone.0254363; https://doaj.org/toc/1932-6203; Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.; Aji John; Kathleen Muenzen; Kristiina Ausmees; Public Library of Science (PLoS); article; Medicine; R; Science; Q; EN; PLoS ONE, Vol 16, Iss 7, p e0254363 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine; R; Science; Q |
spellingShingle |
Medicine; R; Science; Q; Aji John; Kathleen Muenzen; Kristiina Ausmees; Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
description |
Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70. |
format |
article |
author |
Aji John; Kathleen Muenzen; Kristiina Ausmees |
author_facet |
Aji John; Kathleen Muenzen; Kristiina Ausmees |
author_sort |
Aji John |
title |
Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
title_short |
Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
title_full |
Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
title_fullStr |
Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
title_full_unstemmed |
Evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
title_sort |
evaluation of serverless computing for scalable execution of a joint variant calling workflow. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/2b72b067a19a473b82e50d3191c4398e |
work_keys_str_mv |
AT ajijohn evaluationofserverlesscomputingforscalableexecutionofajointvariantcallingworkflow AT kathleenmuenzen evaluationofserverlesscomputingforscalableexecutionofajointvariantcallingworkflow AT kristiinaausmees evaluationofserverlesscomputingforscalableexecutionofajointvariantcallingworkflow |
_version_ |
1718374572341854208 |