Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.

Mathematical aspects of coverage and gaps in genome assembly have received substantial attention by bioinformaticians. Typical problems under consideration suppose that reads can be experimentally obtained from a single genome and that the number of reads will be set to cover a large percentage of t...

Bibliographic Details
Main author: Stephen A Stanhope
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects:
R
Q
Online access: https://doaj.org/article/a8ab48e0a9c74e4a975ef90196901b5d
id oai:doaj.org-article:a8ab48e0a9c74e4a975ef90196901b5d
record_format dspace
spelling oai:doaj.org-article:a8ab48e0a9c74e4a975ef90196901b5d 2021-11-18T06:36:37Z Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments. 1932-6203 10.1371/journal.pone.0011652 https://doaj.org/article/a8ab48e0a9c74e4a975ef90196901b5d 2010-07-01T00:00:00Z https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20686599/?tool=EBI https://doaj.org/toc/1932-6203 Mathematical aspects of coverage and gaps in genome assembly have received substantial attention by bioinformaticians. Typical problems under consideration suppose that reads can be experimentally obtained from a single genome and that the number of reads will be set to cover a large percentage of that genome at a desired depth. In metagenomics experiments genomes from multiple species are simultaneously analyzed and obtaining large numbers of reads per genome is unlikely. We propose the probability of obtaining at least one contig of a desired minimum size from each novel genome in the pool without restriction based on depth of coverage as a metric for metagenomic experimental design. We derive an approximation to the distribution of maximum contig size for single genome assemblies using relatively few reads. This approximation is verified in simulation studies and applied to a number of different metagenomic experimental design problems, ranging in difficulty from detecting a single novel genome in a pool of known species to detecting each of a random number of novel genomes collectively sized and with abundances corresponding to given distributions in a single pool. Stephen A Stanhope Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 5, Iss 7, p e11652 (2010)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Stephen A Stanhope
Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
description Mathematical aspects of coverage and gaps in genome assembly have received substantial attention by bioinformaticians. Typical problems under consideration suppose that reads can be experimentally obtained from a single genome and that the number of reads will be set to cover a large percentage of that genome at a desired depth. In metagenomics experiments genomes from multiple species are simultaneously analyzed and obtaining large numbers of reads per genome is unlikely. We propose the probability of obtaining at least one contig of a desired minimum size from each novel genome in the pool without restriction based on depth of coverage as a metric for metagenomic experimental design. We derive an approximation to the distribution of maximum contig size for single genome assemblies using relatively few reads. This approximation is verified in simulation studies and applied to a number of different metagenomic experimental design problems, ranging in difficulty from detecting a single novel genome in a pool of known species to detecting each of a random number of novel genomes collectively sized and with abundances corresponding to given distributions in a single pool.
format article
author Stephen A Stanhope
author_facet Stephen A Stanhope
author_sort Stephen A Stanhope
title Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
title_short Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
title_full Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
title_fullStr Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
title_full_unstemmed Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
title_sort occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/a8ab48e0a9c74e4a975ef90196901b5d
work_keys_str_mv AT stephenastanhope occupancymodelingmaximumcontigsizeprobabilitiesanddesigningmetagenomicsexperiments
_version_ 1718424434997460992