Use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft Pseudomonas genome sequences.

Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of unif...

Bibliographic Details
Main authors: Heath E O'Brien, Yunchen Gong, Pauline Fung, Pauline W Wang, David S Guttman
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Subjects:
R
Q
Online access: https://doaj.org/article/ba3c07cf52464e579330fba08fbc600d
id oai:doaj.org-article:ba3c07cf52464e579330fba08fbc600d
record_format dspace
spelling oai:doaj.org-article:ba3c07cf52464e579330fba08fbc600d
2021-11-18T07:35:06Z
Use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft Pseudomonas genome sequences.
1932-6203
10.1371/journal.pone.0027199
https://doaj.org/article/ba3c07cf52464e579330fba08fbc600d
2011-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22073286/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Heath E O'Brien
Yunchen Gong
Pauline Fung
Pauline W Wang
David S Guttman
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 6, Iss 11, p e27199 (2011)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
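To give a rough sense of scale for the coverage figures quoted in the description, the sketch below applies the usual sequence-coverage relation (coverage = number of reads × read length / genome size) to the 5.93 Mbp Pph 1448A genome. The record does not state the read length, so the 100 bp value is purely an assumption, as are the function names and the coding-sequence counts used in the usage lines; the 95% threshold is the one named in the description. This is not the authors' pipeline, only an illustration of the arithmetic.

```python
import math

# Closed genome size of Pph 1448A as stated in the description (5.93 Mbp).
GENOME_SIZE_BP = 5_930_000


def reads_for_coverage(coverage: float, genome_size_bp: int, read_length_bp: int) -> int:
    """Smallest read count n such that n * read_length / genome_size >= coverage.

    Treats coverage as plain sequence coverage; read_length_bp is an
    assumption supplied by the caller, not a value from the record.
    """
    if coverage < 0 or genome_size_bp <= 0 or read_length_bp <= 0:
        raise ValueError("coverage must be >= 0; sizes must be positive")
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def is_enhanced_quality_draft(cds_found: int, cds_total: int, threshold: float = 0.95) -> bool:
    """True if the share of identified coding sequences meets the threshold.

    The 95% default follows the description; both counts are hypothetical inputs.
    """
    if cds_total <= 0 or not 0 <= cds_found <= cds_total:
        raise ValueError("need 0 <= cds_found <= cds_total and cds_total > 0")
    return cds_found / cds_total >= threshold


if __name__ == "__main__":
    read_len = 100  # hypothetical read length in bp
    for cov in (100, 5, 2):
        n = reads_for_coverage(cov, GENOME_SIZE_BP, read_len)
        print(f"{cov}x coverage at {read_len} bp reads: {n:,} reads")
    # Hypothetical counts: 4800 of 5000 coding sequences identified.
    print(is_enhanced_quality_draft(4800, 5000))
```

Under the assumed 100 bp reads, the 2-5x mate-pair range corresponds to roughly 2-5% of the reads needed for the 100x paired-end library, which is why the low-coverage large-insert data adds comparatively little sequencing effort.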
format article
author Heath E O'Brien
Yunchen Gong
Pauline Fung
Pauline W Wang
David S Guttman
title Use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft Pseudomonas genome sequences.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doaj.org/article/ba3c07cf52464e579330fba08fbc600d