Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The sta...
Main authors: | Dongjun Chung, Pei Fen Kuan, Bo Li, Rajendran Sanalkumar, Kun Liang, Emery H Bresnick, Colin Dewey, Sündüz Keleş |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2011 |
Subjects: | Biology (General) |
Online access: | https://doaj.org/article/557b1a0a846f4972a5cd394478af23a7 |
id |
oai:doaj.org-article:557b1a0a846f4972a5cd394478af23a7 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:557b1a0a846f4972a5cd394478af23a7; 2021-11-18T05:50:25Z; Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.; 1553-734X; 1553-7358; 10.1371/journal.pcbi.1002111; https://doaj.org/article/557b1a0a846f4972a5cd394478af23a7; 2011-07-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21779159/pdf/?tool=EBI; https://doaj.org/toc/1553-734X; https://doaj.org/toc/1553-7358; Dongjun Chung, Pei Fen Kuan, Bo Li, Rajendran Sanalkumar, Kun Liang, Emery H Bresnick, Colin Dewey, Sündüz Keleş; Public Library of Science (PLoS); article; Biology (General); QH301-705.5; EN; PLoS Computational Biology, Vol 7, Iss 7, p e1002111 (2011) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Biology (General); QH301-705.5 |
description |
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments. |
format |
article |
author |
Dongjun Chung, Pei Fen Kuan, Bo Li, Rajendran Sanalkumar, Kun Liang, Emery H Bresnick, Colin Dewey, Sündüz Keleş |
author_sort |
Dongjun Chung |
title |
Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2011 |
url |
https://doaj.org/article/557b1a0a846f4972a5cd394478af23a7 |
_version_ |
1718424785557389312 |
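
The description above says that multi-reads are allocated "as fractional counts using a weighted alignment scheme" but does not spell out the weights. The following Python sketch illustrates the general idea only, not the authors' method: it assumes each candidate alignment already carries a hypothetical non-negative weight, normalizes those weights per read, and adds the resulting fractions to fixed-width genomic bins. The function name `allocate_reads`, the `(chrom, position, weight)` tuple layout, and the `bin_size` parameter are all illustrative assumptions.

```python
from collections import defaultdict


def allocate_reads(reads, bin_size=100):
    """Return {(chrom, bin_index): fractional read count}.

    `reads` is a list of reads. Each read is a list of candidate
    alignments given as (chrom, position, weight) tuples. A uni-read has
    one alignment; a multi-read has several. The weights are hypothetical
    inputs (for example, derived from alignment quality); they are
    normalized here so that every read contributes one count in total.
    """
    counts = defaultdict(float)
    for alignments in reads:
        total = sum(weight for _, _, weight in alignments)
        if total <= 0:
            continue  # skip reads with no usable alignment weight
        for chrom, position, weight in alignments:
            # Each location receives its share of this single read.
            counts[(chrom, position // bin_size)] += weight / total
    return dict(counts)


# Toy input (made-up coordinates and weights, for illustration only).
reads = [
    [("chr1", 150, 1.0)],                      # uni-read
    [("chr1", 180, 3.0), ("chr2", 420, 1.0)],  # multi-read, unequal weights
    [("chr1", 520, 1.0), ("chr2", 450, 1.0)],  # multi-read, equal weights
]
counts = allocate_reads(reads)

for (chrom, bin_index), value in sorted(counts.items()):
    print(chrom, bin_index * 100, value)
```

Because the weights are normalized per read, the fractional counts always sum to the number of reads, so including multi-reads raises effective depth without counting any read more than once. A peak caller would then operate on these per-bin counts in place of integer uni-read counts.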