BAMClipper: removing primers from alignments to minimize false-negative mutations in amplicon next-generation sequencing

Abstract: Amplicon-based next-generation sequencing (NGS) has been widely adopted for detecting genetic variation in humans and other organisms. The conventional data analysis paradigm includes primer trimming before read mapping. Here we introduce BAMClipper, which removes primer sequences after mapping or...

Bibliographic Details
Main authors: Chun Hang Au, Dona N. Ho, Ava Kwong, Tsun Leung Chan, Edmond S. K. Ma
Format: article
Language: EN
Published: Nature Portfolio 2017
Subjects: Medicine (R); Science (Q)
Online access: https://doaj.org/article/6e4c0ae171ba4d6baeb272e2a5123b23
id oai:doaj.org-article:6e4c0ae171ba4d6baeb272e2a5123b23
record_format dspace
record_date 2021-12-02T11:41:00Z
doi 10.1038/s41598-017-01703-6
doi_url https://doi.org/10.1038/s41598-017-01703-6
issn 2045-2322
journal_url https://doaj.org/toc/2045-2322
publication_date 2017-05-01T00:00:00Z
source Scientific Reports, Vol 7, Iss 1, Pp 1-7 (2017)
institution DOAJ
collection DOAJ
language EN
topic Medicine (R); Science (Q)
description Abstract: Amplicon-based next-generation sequencing (NGS) has been widely adopted for detecting genetic variation in humans and other organisms. The conventional data analysis paradigm includes primer trimming before read mapping. Here we introduce BAMClipper, which removes primer sequences after mapping the original sequencing reads, by soft-clipping SAM/BAM alignments. Mutation detection accuracy was affected by the choice of primer handling approach, based on real NGS datasets of 7 human peripheral blood or breast cancer tissue samples with known BRCA1/BRCA2 mutations and >130,000 simulated NGS datasets with unique mutations. The BAMClipper approach detected a BRCA1 deletion (c.1620_1636del) that was otherwise missed due to an edge effect. Simulation showed a high false-negative rate when primers were perfectly trimmed as in conventional practice. Among the other 6 samples, variant allele frequencies of 5 BRCA1/BRCA2 mutations (indels or single-nucleotide variants) were diluted by apparently wild-type primer sequences from an overlapping amplicon (17% to 82% underestimation). BAMClipper was robust in both situations, and all 7 mutations were detected. Compared with Cutadapt, BAMClipper was faster and maintained equally high primer removal effectiveness. BAMClipper is implemented in Perl and is available under an open source MIT license at https://github.com/tommyau/bamclipper.
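
The abstract describes removing primers by soft-clipping alignments rather than trimming reads before mapping. The Python sketch below illustrates that general idea on a single CIGAR string. It is not BAMClipper's Perl code: the function name `soft_clip_left`, its parameters, and the example coordinates are hypothetical, and only the left end of a read is handled. A complete tool would also need to treat the right end and keep paired-read fields consistent.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_left(pos, cigar, primer_end):
    """Soft-clip read bases aligned at or before primer_end.

    pos        -- 1-based leftmost reference position of the alignment
    cigar      -- CIGAR string of the alignment
    primer_end -- last reference position covered by the primer (1-based)
    Returns (new_pos, new_cigar). Hypothetical helper for illustration only.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Keep a leading hard clip in place; it must stay outermost.
    hard = [ops.pop(0)] if ops and ops[0][1] == "H" else []
    ref = pos      # reference position of the next aligned base
    clipped = 0    # read bases converted to soft clip
    # Consume operations until the alignment starts with a match after the primer.
    while ops and ops[0][1] != "H" and (ref <= primer_end or ops[0][1] not in "M=X"):
        n, op = ops.pop(0)
        if op in "M=X":
            take = min(n, primer_end - ref + 1)
            clipped += take
            ref += take
            if take < n:
                ops.insert(0, (n - take, op))  # remainder stays aligned
        elif op in "IS":
            clipped += n   # read bases without reference positions
        elif op in "DN":
            ref += n       # reference positions without read bases
        # P (padding) consumes neither and is dropped
    new_ops = hard + ([(clipped, "S")] if clipped else []) + ops
    return ref, "".join(f"{n}{op}" for n, op in new_ops)


def query_length(cigar):
    """Number of read bases described by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MIS=X")


# A 50-base read starting at position 100, with a primer covering 100..119.
print(soft_clip_left(100, "50M", 119))  # (120, '20S30M')
```

Because the primer bases remain in the read and are only marked as soft-clipped, the read length is unchanged and the alignment start moves to the first base after the primer.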