DNA barcode sequence identification incorporating taxonomic hierarchy and within taxon variability.
For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g....
Main author: | Damon P Little |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2011 |
Subjects: | Medicine (R); Science (Q) |
Online access: | https://doaj.org/article/e8a0859390ce4d3abc4d264758f8befd |
id | oai:doaj.org-article:e8a0859390ce4d3abc4d264758f8befd |
---|---|
record_format | dspace |
timestamp | 2021-11-18T06:47:55Z |
issn | 1932-6203 |
doi | 10.1371/journal.pone.0020552 |
fulltext_url | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21857897/?tool=EBI |
journal_toc_url | https://doaj.org/toc/1932-6203 |
source | PLoS ONE, Vol 6, Iss 8, p e20552 (2011) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | Medicine R Science Q |
description | For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple-sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment-free sequence identification algorithm--BRONX--that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple-sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user-defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini-barcode queries against a full-length barcode database). BRONX consistently produced better identifications at the genus-level for all query types. |
format | article |
author | Damon P Little |
title | DNA barcode sequence identification incorporating taxonomic hierarchy and within taxon variability. |
publisher | Public Library of Science (PLoS) |
publishDate | 2011 |
url | https://doaj.org/article/e8a0859390ce4d3abc4d264758f8befd |
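The abstract describes scoring variable regions of a query by locating invariant flanking regions, without building a global alignment. The Python sketch below is a toy illustration of that general idea only; it is not the BRONX implementation. The flank pairs, the taxon names (`Aus bus`, `Aus cus`), the sequences, and the function names (`extract_segment`, `build_index`, `score_query`) are all hypothetical. It also assumes exact flank matches and a simple count of matching regions as the score, which the record does not specify.

```python
from collections import defaultdict


def extract_segment(seq, left, right):
    """Return the text between the first `left` flank and the next `right` flank, or None."""
    start = seq.find(left)
    if start == -1:
        return None
    start += len(left)
    end = seq.find(right, start)
    if end == -1:
        return None
    return seq[start:end]


def build_index(references, flanks, terminal=lambda taxon: taxon):
    """Collect every variant observed per terminal for each flank pair.

    `terminal` maps a taxon name to the terminal being scored, so a more
    inclusive terminal (e.g. a genus) pools the variants of its members.
    """
    index = {pair: defaultdict(set) for pair in flanks}
    for taxon, seq in references:
        for left, right in flanks:
            segment = extract_segment(seq, left, right)
            if segment is not None:
                index[(left, right)][terminal(taxon)].add(segment)
    return index


def score_query(query, index):
    """Count, per terminal, the variable regions whose query variant was observed."""
    scores = defaultdict(int)
    for (left, right), variants_by_terminal in index.items():
        segment = extract_segment(query, left, right)
        if segment is None:
            continue  # flank pair absent, e.g. a short query covering one region
        for name, variants in variants_by_terminal.items():
            if segment in variants:
                scores[name] += 1
    return dict(scores)


# Hypothetical toy data: two flank pairs, two species in one genus.
FLANKS = [("ACGT", "TTGA"), ("GGCC", "AATT")]
REFERENCES = [
    ("Aus bus", "ACGTACTTGACGGCCGTAATT"),  # region 1 = AC, region 2 = GT
    ("Aus bus", "ACGTATTTGACGGCCGTAATT"),  # region 1 = AT (within-taxon variant)
    ("Aus cus", "ACGTGGTTGACGGCCCAAATT"),  # region 1 = GG, region 2 = CA
]

species_index = build_index(REFERENCES, FLANKS)
genus_index = build_index(REFERENCES, FLANKS, terminal=lambda t: t.split()[0])

short_query = "ACGTATTTGA"             # covers region 1 only
mixed_query = "ACGTGGTTGACGGCCGTAATT"  # region 1 = GG, region 2 = GT

print(score_query(short_query, species_index))
print(score_query(mixed_query, species_index))
print(score_query(mixed_query, genus_index))
```

In this toy data the short query still scores against the full-length references because only the flanks it actually contains are used. The mixed query splits its matches between the two species but matches both regions once variants are pooled at the genus level, which loosely mirrors the separate per-level identifications the abstract mentions.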