Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons.

Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env,...

Bibliographic Details
Main Authors: Kemal Eren, Steven Weaver, Robert Ketteringham, Morné Valentyn, Melissa Laird Smith, Venkatesh Kumar, Sanjay Mohan, Sergei L Kosakovsky Pond, Ben Murrell
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2018
Subjects: Biology (General)
Online Access: https://doaj.org/article/f6845c2ab3904b2da2c22b7046705531
id oai:doaj.org-article:f6845c2ab3904b2da2c22b7046705531
record_format dspace
spelling oai:doaj.org-article:f6845c2ab3904b2da2c22b7046705531
2021-12-02T19:57:35Z
Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons.
1553-734X
1553-7358
10.1371/journal.pcbi.1006498
https://doaj.org/article/f6845c2ab3904b2da2c22b7046705531
2018-12-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1006498
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
PLoS Computational Biology, Vol 14, Iss 12, p e1006498 (2018)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env, the protein of interest for HIV vaccine studies, is exceptionally challenging for long-read sequencing and analysis due to its length, high substitution rate, and extensive indel variation. While long-read sequencing is attractive in this setting, the analysis of such data is not well handled by existing methods. To address this, we introduce FLEA (Full-Length Envelope Analyzer), which performs end-to-end analysis and visualization of long-read sequencing data. FLEA consists of both a pipeline (optionally run on a high-performance cluster), and a client-side web application that provides interactive results. The pipeline transforms FASTQ reads into high-quality consensus sequences (HQCSs) and uses them to build a codon-aware multiple sequence alignment. The resulting alignment is then used to infer phylogenies, selection pressure, and evolutionary dynamics. The web application provides publication-quality plots and interactive visualizations, including an annotated viral alignment browser, time series plots of evolutionary dynamics, visualizations of gene-wide selective pressures (such as dN/dS) across time and across protein structure, and a phylogenetic tree browser. We demonstrate how FLEA may be used to process Pacific Biosciences HIV env data and describe recent examples of its use. Simulations show how FLEA dramatically reduces the error rate of this sequencing platform, providing an accurate portrait of complex and variable HIV env populations. A public instance of FLEA is hosted at http://flea.datamonkey.org. The Python source code for the FLEA pipeline can be found at https://github.com/veg/flea-pipeline. The client-side application is available at https://github.com/veg/flea-web-app. A live demo of the P018 results can be found at http://flea.murrell.group/view/P018.
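The description above says the pipeline turns FASTQ reads into high-quality consensus sequences and that this reduces the sequencing error rate. As a rough illustration of why consensus calling helps, and not a reproduction of FLEA's actual procedure (which this record does not describe), the following Python sketch takes a per-column majority vote over reads that are assumed to be already aligned to equal length. The function name `majority_consensus` and the toy reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base of pre-aligned, equal-length reads.

    Illustrative assumption: reads are already aligned, so column i of every
    read refers to the same position. Real long-read data would need an
    alignment step first.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must all have the same aligned length")

    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) yields the most frequent base in this column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy data: five noisy copies of the same template, two with a single error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "ACGTAC"]
consensus = majority_consensus(reads)
print(consensus)
```

Because the two errors fall in different columns, each is outvoted by the other four reads, so isolated random errors tend to cancel as read depth grows.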
format article
author Kemal Eren
Steven Weaver
Robert Ketteringham
Morné Valentyn
Melissa Laird Smith
Venkatesh Kumar
Sanjay Mohan
Sergei L Kosakovsky Pond
Ben Murrell
author_sort Kemal Eren
title Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons.
publisher Public Library of Science (PLoS)
publishDate 2018
url https://doaj.org/article/f6845c2ab3904b2da2c22b7046705531