Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.

Next-generation sequencing (NGS) of metagenomic samples is becoming a standard approach for detecting individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity, and as a result species- or strain-level identificat...

Bibliographic Details
Main authors: Lőrinc S Pongor, Roberto Vera, Balázs Ligeti
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online access: https://doaj.org/article/61c68f04868048e1ae46cf9d57e56620
id oai:doaj.org-article:61c68f04868048e1ae46cf9d57e56620
record_format dspace
spelling oai:doaj.org-article:61c68f04868048e1ae46cf9d57e56620
2021-11-25T06:06:35Z
Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
1932-6203
10.1371/journal.pone.0103441
https://doaj.org/article/61c68f04868048e1ae46cf9d57e56620
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/25077800/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Next-generation sequencing (NGS) of metagenomic samples is becoming a standard approach for detecting individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity, and as a result species- or strain-level identification is often inaccurate and low-abundance pathogens can sometimes be missed. We have developed Taxoner, an open-source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than approaches that use small marker databases but is more sensitive due to its comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the mapped genes as well as strain-level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
Lőrinc S Pongor
Roberto Vera
Balázs Ligeti
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 9, Iss 7, p e103441 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Lőrinc S Pongor
Roberto Vera
Balázs Ligeti
Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
description Next-generation sequencing (NGS) of metagenomic samples is becoming a standard approach for detecting individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity, and as a result species- or strain-level identification is often inaccurate and low-abundance pathogens can sometimes be missed. We have developed Taxoner, an open-source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than approaches that use small marker databases but is more sensitive due to its comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the mapped genes as well as strain-level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
format article
author Lőrinc S Pongor
Roberto Vera
Balázs Ligeti
author_facet Lőrinc S Pongor
Roberto Vera
Balázs Ligeti
author_sort Lőrinc S Pongor
title Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
title_short Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
title_full Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
title_fullStr Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
title_full_unstemmed Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.
title_sort fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop pc: application to metagenomic datasets and pathogen identification.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/61c68f04868048e1ae46cf9d57e56620
work_keys_str_mv AT lorincspongor fastandsensitivealignmentofmicrobialwholegenomesequencingreadstolargesequencedatasetsonadesktoppcapplicationtometagenomicdatasetsandpathogenidentification
AT robertovera fastandsensitivealignmentofmicrobialwholegenomesequencingreadstolargesequencedatasetsonadesktoppcapplicationtometagenomicdatasetsandpathogenidentification
AT balazsligeti fastandsensitivealignmentofmicrobialwholegenomesequencingreadstolargesequencedatasetsonadesktoppcapplicationtometagenomicdatasetsandpathogenidentification
_version_ 1718414165765259264