OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis.

Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, desp...

Bibliographic Details
Main authors: Greg Finak, Jacob Frelinger, Wenxin Jiang, Evan W Newell, John Ramey, Mark M Davis, Spyros A Kalams, Stephen C De Rosa, Raphael Gottardo
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects: Biology (General)
Online access: https://doaj.org/article/8bde3b2c0d8a49e7b2ac8c80a86ebb07
id oai:doaj.org-article:8bde3b2c0d8a49e7b2ac8c80a86ebb07
record_format dspace
spelling oai:doaj.org-article:8bde3b2c0d8a49e7b2ac8c80a86ebb07
2021-11-25T05:40:48Z
1553-734X
1553-7358
10.1371/journal.pcbi.1003806
https://doaj.org/article/8bde3b2c0d8a49e7b2ac8c80a86ebb07
2014-08-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/25167361/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
PLoS Computational Biology, Vol 10, Iss 8, p e1003806 (2014)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, despite a general consensus of their importance to the future of the field, have been slow to gain widespread adoption. Here we present OpenCyto, a new BioConductor infrastructure and data analysis framework designed to lower the barrier of entry to automated flow data analysis algorithms by addressing key areas that we believe have held back wider adoption of automated approaches. OpenCyto supports end-to-end data analysis that is robust and reproducible while generating results that are easy to interpret. We have improved the existing, widely used core BioConductor flow cytometry infrastructure by allowing analysis to scale in a memory efficient manner to the large flow data sets that arise in clinical trials, and integrating domain-specific knowledge as part of the pipeline through the hierarchical relationships among cell populations. Pipelines are defined through a text-based csv file, limiting the need to write data-specific code, and are data agnostic to simplify repetitive analysis for core facilities. We demonstrate how to analyze two large cytometry data sets: an intracellular cytokine staining (ICS) data set from a published HIV vaccine trial focused on detecting rare, antigen-specific T-cell populations, where we identify a new subset of CD8 T-cells with a vaccine-regimen specific response that could not be identified through manual analysis, and a CyTOF T-cell phenotyping data set where a large staining panel and many cell populations are a challenge for traditional analysis. The substantial improvements to the core BioConductor flow cytometry packages give OpenCyto the potential for wide adoption. It can rapidly leverage new developments in computational cytometry and facilitate reproducible analysis in a unified environment.
format article
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/8bde3b2c0d8a49e7b2ac8c80a86ebb07