Exploring Pandora's box: potential and pitfalls of low coverage genome surveys for evolutionary biology.

High throughput sequencing technologies are revolutionizing genetic research. With this "rise of the machines", genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored...

Bibliographic Details
Main authors: Florian Leese, Philipp Brand, Andrey Rozenberg, Christoph Mayer, Shobhit Agrawal, Johannes Dambach, Lars Dietz, Jana S Doemel, William P Goodall-Copstake, Christoph Held, Jennifer A Jackson, Kathrin P Lampert, Katrin Linse, Jan N Macher, Jennifer Nolzen, Michael J Raupach, Nicole T Rivera, Christoph D Schubart, Sebastian Striewski, Ralph Tollrian, Chester J Sands
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
R
Q
Online access: https://doaj.org/article/e4fbd7d257b44d0395e95b9e28f54023
id oai:doaj.org-article:e4fbd7d257b44d0395e95b9e28f54023
record_format dspace
spelling oai:doaj.org-article:e4fbd7d257b44d0395e95b9e28f54023
2021-11-18T08:08:01Z
Exploring Pandora's box: potential and pitfalls of low coverage genome surveys for evolutionary biology.
1932-6203
10.1371/journal.pone.0049202
https://doaj.org/article/e4fbd7d257b44d0395e95b9e28f54023
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23185309/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 7, Iss 11, p e49202 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description High throughput sequencing technologies are revolutionizing genetic research. With this "rise of the machines", genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02-25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers.
format article
author Florian Leese
Philipp Brand
Andrey Rozenberg
Christoph Mayer
Shobhit Agrawal
Johannes Dambach
Lars Dietz
Jana S Doemel
William P Goodall-Copstake
Christoph Held
Jennifer A Jackson
Kathrin P Lampert
Katrin Linse
Jan N Macher
Jennifer Nolzen
Michael J Raupach
Nicole T Rivera
Christoph D Schubart
Sebastian Striewski
Ralph Tollrian
Chester J Sands
title Exploring Pandora's box: potential and pitfalls of low coverage genome surveys for evolutionary biology.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/e4fbd7d257b44d0395e95b9e28f54023