DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.

DNA is a complex molecule carrying the instructions an organism needs to develop, live and reproduce. In 1953, Watson and Crick discovered that DNA is composed of two chains forming a double-helix. Later on, other structures of DNA were discovered and shown to play important roles in the cell, in pa...

Bibliographic Details
Main authors: Vincent Rocher, Matthieu Genais, Elissar Nassereddine, Raphael Mourad
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: Biology (General)
Online access: https://doaj.org/article/fa6917e498234847a88643103b214c39
id oai:doaj.org-article:fa6917e498234847a88643103b214c39
record_format dspace
spelling oai:doaj.org-article:fa6917e498234847a88643103b214c39
2021-12-02T19:58:05Z
DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
1553-734X
1553-7358
10.1371/journal.pcbi.1009308
https://doaj.org/article/fa6917e498234847a88643103b214c39
2021-08-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1009308
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Vincent Rocher
Matthieu Genais
Elissar Nassereddine
Raphael Mourad
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 17, Iss 8, p e1009308 (2021)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
spellingShingle Biology (General)
QH301-705.5
Vincent Rocher
Matthieu Genais
Elissar Nassereddine
Raphael Mourad
DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
description DNA is a complex molecule carrying the instructions an organism needs to develop, live and reproduce. In 1953, Watson and Crick discovered that DNA is composed of two chains forming a double helix. Later, other DNA structures were discovered and shown to play important roles in the cell, in particular the G-quadruplex (G4). Following genome sequencing, several bioinformatic algorithms were developed to map G4s in vitro based on a canonical sequence motif, G-richness and G-skewness, or alternatively on sequence features including k-mers, and more recently on machine/deep learning. Recently, new sequencing techniques were developed to map G4s in vitro (G4-seq) and in vivo (G4 ChIP-seq) at a resolution of a few hundred bases. Here, we propose a novel convolutional neural network (DeepG4) to map cell-type specific active G4 regions (i.e. regions within which G4s form both in vitro and in vivo). DeepG4 predicts active G4 regions in different cell types with high accuracy. Moreover, DeepG4 identifies key DNA motifs that are predictive of G4 region activity. We found that such motifs do not follow the very flexible sequence pattern that current algorithms search for. Instead, active G4 regions are determined by numerous specific motifs. Among those motifs, we identified motifs of known transcription factors (TFs), which could play important roles in G4 activity by contributing either directly to G4 structures themselves or indirectly by participating in G4 formation in the vicinity. In addition, we used DeepG4 to predict active G4 regions in a large number of tissues and cancers, thereby providing a comprehensive resource for researchers. Availability: https://github.com/morphos30/DeepG4.
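The sequence-based signals named in the description (a canonical motif, G-richness, G-skewness) and the kind of input a convolutional network consumes can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the record does not define the motif or the two scores, so the sketch uses the commonly cited pattern of four runs of three or more guanines separated by loops of 1 to 7 nucleotides, G-richness as the fraction of G bases, and G-skewness as (G - C) / (G + C). The function names are hypothetical, and this is not DeepG4's own code or preprocessing.

```python
import re

# Assumed canonical G4 motif (not stated in the record):
# four runs of 3+ guanines separated by loops of 1-7 nucleotides.
CANONICAL_G4 = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def has_canonical_g4(seq):
    """Return True if seq contains at least one match of the assumed motif."""
    return CANONICAL_G4.search(seq.upper()) is not None


def g_richness(seq):
    """Fraction of bases in seq that are G (0.0 for an empty sequence)."""
    seq = seq.upper()
    return seq.count("G") / len(seq) if seq else 0.0


def g_skewness(seq):
    """(G - C) / (G + C); 0.0 when seq contains neither G nor C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def one_hot(seq):
    """One-hot encode DNA as rows of [A, C, G, T]; other symbols map to all zeros.

    This is a generic encoding often used as CNN input, shown as an
    assumption; the record does not describe DeepG4's input format.
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"  # human telomeric repeat, used as a toy input
    print(has_canonical_g4(telomeric))
    print(round(g_richness(telomeric), 3), g_skewness(telomeric))
    print(one_hot("ACGN"))
```

A motif scan of this kind only asks whether a sequence could fold into a G4, whereas the description frames DeepG4 as predicting where G4s are actually active in a given cell type.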
format article
author Vincent Rocher
Matthieu Genais
Elissar Nassereddine
Raphael Mourad
author_facet Vincent Rocher
Matthieu Genais
Elissar Nassereddine
Raphael Mourad
author_sort Vincent Rocher
title DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
title_short DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
title_full DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
title_fullStr DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
title_full_unstemmed DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.
title_sort deepg4: a deep learning approach to predict cell-type specific active g-quadruplex regions.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/fa6917e498234847a88643103b214c39
work_keys_str_mv AT vincentrocher deepg4adeeplearningapproachtopredictcelltypespecificactivegquadruplexregions
AT matthieugenais deepg4adeeplearningapproachtopredictcelltypespecificactivegquadruplexregions
AT elissarnassereddine deepg4adeeplearningapproachtopredictcelltypespecificactivegquadruplexregions
AT raphaelmourad deepg4adeeplearningapproachtopredictcelltypespecificactivegquadruplexregions
_version_ 1718375817252175872