A comparison of peak callers used for DNase-Seq data.

Genome-wide profiling of open chromatin regions using DNase I and high-throughput sequencing (DNase-seq) is an increasingly popular approach for finding and studying regulatory elements. A variety of algorithms have been developed to identify regions of open chromatin from raw sequence-tag data, which has motivated us to assess and compare their performance.
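The description field of this record states that the overlap between the peak regions reported by each algorithm and an ENCODE-derived reference set was used to assess sensitivity and specificity. The sketch below illustrates one way such an overlap score could be computed; it is not the authors' pipeline. The interval representation, the one-base overlap rule, the function names, and the toy coordinates are all assumptions made for illustration. Because the abstract does not say how negative regions were defined, the second function reports the fraction of called peaks supported by the reference (precision) rather than specificity.

```python
from typing import List, Tuple

# Hypothetical interval type: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base (assumed overlap rule)."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def sensitivity(peaks: List[Interval], reference: List[Interval]) -> float:
    """Fraction of reference regions overlapped by at least one called peak."""
    if not reference:
        return 0.0
    hits = sum(1 for r in reference if any(overlaps(r, p) for p in peaks))
    return hits / len(reference)


def precision(peaks: List[Interval], reference: List[Interval]) -> float:
    """Fraction of called peaks overlapping at least one reference region.

    This is a stand-in measure, not the specificity used in the article.
    """
    if not peaks:
        return 0.0
    hits = sum(1 for p in peaks if any(overlaps(p, r) for r in reference))
    return hits / len(peaks)


# Toy data, invented for illustration only.
reference = [("chr1", 100, 200), ("chr1", 500, 600),
             ("chr2", 50, 150), ("chr2", 900, 1000)]
peaks = [("chr1", 150, 250), ("chr1", 700, 800), ("chr2", 100, 120)]

print(sensitivity(peaks, reference))
print(precision(peaks, reference))
```

The nested scan is quadratic in the number of intervals, which is fine for a toy example; genome-scale peak sets would normally be sorted or indexed first.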

Bibliographic Details
Main Authors: Hashem Koohy, Thomas A Down, Mikhail Spivakov, Tim Hubbard
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online Access: https://doaj.org/article/72529f33aa2046fc876b6f845a6a1978
id oai:doaj.org-article:72529f33aa2046fc876b6f845a6a1978
record_format dspace
spelling oai:doaj.org-article:72529f33aa2046fc876b6f845a6a1978
2021-11-18T08:20:11Z
A comparison of peak callers used for DNase-Seq data.
1932-6203
10.1371/journal.pone.0096303
https://doaj.org/article/72529f33aa2046fc876b6f845a6a1978
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24810143/?tool=EBI
https://doaj.org/toc/1932-6203
Hashem Koohy
Thomas A Down
Mikhail Spivakov
Tim Hubbard
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 9, Iss 5, p e96303 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Hashem Koohy
Thomas A Down
Mikhail Spivakov
Tim Hubbard
A comparison of peak callers used for DNase-Seq data.
description Genome-wide profiling of open chromatin regions using DNase I and high-throughput sequencing (DNase-seq) is an increasingly popular approach for finding and studying regulatory elements. A variety of algorithms have been developed to identify regions of open chromatin from raw sequence-tag data, which has motivated us to assess and compare their performance. In this study, four published, publicly available peak calling algorithms used for DNase-seq data analysis (F-seq, Hotspot, MACS and ZINBA) are assessed at a range of signal thresholds on two published DNase-seq datasets for three cell types. The results were benchmarked against an independent dataset of regulatory regions derived from ENCODE in vivo transcription factor binding data for each particular cell type. The level of overlap between peak regions reported by each algorithm and this ENCODE-derived reference set was used to assess sensitivity and specificity of the algorithms. Our study suggests that F-seq has a slightly higher sensitivity than the next best algorithms. Hotspot and the ChIP-seq-oriented method, MACS, both perform competitively when used with their default parameters. However, the generic peak finder ZINBA appears to be less sensitive than the other three. We also assess the accuracy of each algorithm over a range of signal thresholds. In particular, we show that the accuracy of F-seq can be considerably improved by using a threshold setting that is different from the default value.
format article
author Hashem Koohy
Thomas A Down
Mikhail Spivakov
Tim Hubbard
author_facet Hashem Koohy
Thomas A Down
Mikhail Spivakov
Tim Hubbard
author_sort Hashem Koohy
title A comparison of peak callers used for DNase-Seq data.
title_short A comparison of peak callers used for DNase-Seq data.
title_full A comparison of peak callers used for DNase-Seq data.
title_fullStr A comparison of peak callers used for DNase-Seq data.
title_full_unstemmed A comparison of peak callers used for DNase-Seq data.
title_sort comparison of peak callers used for dnase-seq data.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/72529f33aa2046fc876b6f845a6a1978
work_keys_str_mv AT hashemkoohy acomparisonofpeakcallersusedfordnaseseqdata
AT thomasadown acomparisonofpeakcallersusedfordnaseseqdata
AT mikhailspivakov acomparisonofpeakcallersusedfordnaseseqdata
AT timhubbard acomparisonofpeakcallersusedfordnaseseqdata
AT hashemkoohy comparisonofpeakcallersusedfordnaseseqdata
AT thomasadown comparisonofpeakcallersusedfordnaseseqdata
AT mikhailspivakov comparisonofpeakcallersusedfordnaseseqdata
AT timhubbard comparisonofpeakcallersusedfordnaseseqdata
_version_ 1718421929342271488