CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.

The majority of next-generation sequencing short-reads can be properly aligned by leading aligners at high speed. However, the alignment quality can still be further improved, since usually not all reads can be correctly aligned to large genomes, such as the human genome, even for simulated data. Mo...

Bibliographic Details
Main Authors: Yongchao Liu, Bernt Popp, Bertil Schmidt
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online Access: https://doaj.org/article/607ddbc33c7f428aaefb6bdec3db870f
id oai:doaj.org-article:607ddbc33c7f428aaefb6bdec3db870f
record_format dspace
spelling oai:doaj.org-article:607ddbc33c7f428aaefb6bdec3db870f
2021-11-18T08:36:19Z
CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
1932-6203
10.1371/journal.pone.0086869
https://doaj.org/article/607ddbc33c7f428aaefb6bdec3db870f
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24466273/?tool=EBI
https://doaj.org/toc/1932-6203
Yongchao Liu
Bernt Popp
Bertil Schmidt
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 9, Iss 1, p e86869 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Yongchao Liu
Bernt Popp
Bertil Schmidt
CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
description The majority of next-generation sequencing short-reads can be properly aligned by leading aligners at high speed. However, the alignment quality can still be further improved, since usually not all reads can be correctly aligned to large genomes, such as the human genome, even for simulated data. Moreover, even slight improvements in this area are important but challenging, and usually require significantly more computational endeavor. In this paper, we present CUSHAW3, an open-source parallelized, sensitive and accurate short-read aligner for both base-space and color-space sequences. In this aligner, we have investigated a hybrid seeding approach to improve alignment quality, which incorporates three different seed types, i.e. maximal exact match seeds, exact-match k-mer seeds and variable-length seeds, into the alignment pipeline. Furthermore, three techniques: weighted seed-pairing heuristic, paired-end alignment pair ranking and read mate rescuing have been conceived to facilitate accurate paired-end alignment. For base-space alignment, we have compared CUSHAW3 to Novoalign, CUSHAW2, BWA-MEM, Bowtie2 and GEM, by aligning both simulated and real reads to the human genome. The results show that CUSHAW3 consistently outperforms CUSHAW2, BWA-MEM, Bowtie2 and GEM in terms of single-end and paired-end alignment. Furthermore, our aligner has demonstrated better paired-end alignment performance than Novoalign for short-reads with high error rates. For color-space alignment, CUSHAW3 is consistently one of the best aligners compared to SHRiMP2 and BFAST. The source code of CUSHAW3 and all simulated data are available at http://cushaw3.sourceforge.net.
format article
author Yongchao Liu
Bernt Popp
Bertil Schmidt
author_facet Yongchao Liu
Bernt Popp
Bertil Schmidt
author_sort Yongchao Liu
title CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
title_short CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
title_full CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
title_fullStr CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
title_full_unstemmed CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
title_sort cushaw3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/607ddbc33c7f428aaefb6bdec3db870f
work_keys_str_mv AT yongchaoliu cushaw3sensitiveandaccuratebasespaceandcolorspaceshortreadalignmentwithhybridseeding
AT berntpopp cushaw3sensitiveandaccuratebasespaceandcolorspaceshortreadalignmentwithhybridseeding
AT bertilschmidt cushaw3sensitiveandaccuratebasespaceandcolorspaceshortreadalignmentwithhybridseeding
_version_ 1718421562643709952