Identifying a high fraction of the human genome to be under selective constraint using GERP++.

Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individu...

Bibliographic Details
Main authors: Eugene V Davydov, David L Goode, Marina Sirota, Gregory M Cooper, Arend Sidow, Serafim Batzoglou
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Biology (General)
Online access: https://doaj.org/article/c3537d52341745768f626395e71118d1
id oai:doaj.org-article:c3537d52341745768f626395e71118d1
record_format dspace
spelling oai:doaj.org-article:c3537d52341745768f626395e71118d1
2021-11-18T05:50:50Z
1553-734X
1553-7358
10.1371/journal.pcbi.1001025
https://doaj.org/article/c3537d52341745768f626395e71118d1
2010-12-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21152010/pdf/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
PLoS Computational Biology, Vol 6, Iss 12, p e1001025 (2010)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individual positions of a multiple alignment and then defining constrained elements as segments of contiguous, high-scoring nucleotide positions. Here we present GERP++, a new tool that uses maximum likelihood evolutionary rate estimation for position-specific scoring and, in contrast to previous bottom-up methods, a novel dynamic programming approach to subsequently define constrained elements. GERP++ evaluates a richer set of candidate element breakpoints and ranks them based on statistical significance, eliminating the need for biased heuristic extension techniques. Using GERP++, we identify over 1.3 million constrained elements spanning over 7% of the human genome. We predict a higher fraction than earlier estimates largely due to the annotation of longer constrained elements, which improves one-to-one correspondence between predicted elements and known functional sequences. GERP++ is an efficient and effective tool to provide both nucleotide- and element-level constraint scores within deep multiple sequence alignments.
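The bottom-up idea in the description (score each alignment position, then group contiguous high-scoring positions into elements) can be illustrated with a small sketch. This is not the GERP++ algorithm: it substitutes a generic maximum-sum segment search (Kadane's dynamic program) applied recursively, and it uses a plain score threshold where GERP++ ranks candidates by statistical significance. The function names `best_segment` and `constrained_segments`, the `min_total` parameter, and the example scores are all hypothetical.

```python
from typing import List, Tuple


def best_segment(scores: List[float], lo: int, hi: int) -> Tuple[float, int, int]:
    """Return (total, start, end) of the maximum-sum contiguous run in
    scores[lo:hi], with end exclusive (Kadane's dynamic program)."""
    best = (float("-inf"), lo, lo)
    run_sum, run_start = 0.0, lo
    for i in range(lo, hi):
        if run_sum <= 0:
            # A non-positive prefix cannot help; start a new run here.
            run_sum, run_start = scores[i], i
        else:
            run_sum += scores[i]
        if run_sum > best[0]:
            best = (run_sum, run_start, i + 1)
    return best


def constrained_segments(
    scores: List[float], min_total: float
) -> List[Tuple[int, int, float]]:
    """Collect non-overlapping segments whose summed score is at least
    min_total (an assumed cutoff, standing in for a significance test)."""
    found = []
    stack = [(0, len(scores))]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        total, start, end = best_segment(scores, lo, hi)
        if total < min_total:
            continue
        found.append((start, end, total))
        # Search the flanks on either side of the accepted segment.
        stack.append((lo, start))
        stack.append((end, hi))
    return sorted(found)


if __name__ == "__main__":
    # Made-up per-position constraint scores for a ten-column alignment.
    example = [-1.0, 2.0, 3.0, -0.5, 2.5, -4.0, -1.0, 1.5, 2.0, -2.0]
    for start, end, total in constrained_segments(example, min_total=3.0):
        print(f"positions {start}..{end - 1}: total score {total}")
```

Note that the segment covering positions 1 to 4 absorbs a single low-scoring position, which is one way a segment-level search can report longer elements than a strict per-position cutoff would.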
format article
author Eugene V Davydov
David L Goode
Marina Sirota
Gregory M Cooper
Arend Sidow
Serafim Batzoglou
title Identifying a high fraction of the human genome to be under selective constraint using GERP++.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/c3537d52341745768f626395e71118d1