Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.

Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distribut...

Bibliographic Details
Main authors: Daniel Ting, Guoli Wang, Maxim Shapovalov, Rajib Mitra, Michael I Jordan, Roland L Dunbrack
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Biology (General)
Online access: https://doaj.org/article/1cf6d30617ee43cbad901d1b339e3f60
id oai:doaj.org-article:1cf6d30617ee43cbad901d1b339e3f60
record_format dspace
spelling oai:doaj.org-article:1cf6d30617ee43cbad901d1b339e3f60
2021-12-02T19:58:27Z
Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.
1553-734X
1553-7358
10.1371/journal.pcbi.1000763
https://doaj.org/article/1cf6d30617ee43cbad901d1b339e3f60
2010-04-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20442867/pdf/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Daniel Ting
Guoli Wang
Maxim Shapovalov
Rajib Mitra
Michael I Jordan
Roland L Dunbrack
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 6, Iss 4, p e1000763 (2010)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.
format article
author Daniel Ting
Guoli Wang
Maxim Shapovalov
Rajib Mitra
Michael I Jordan
Roland L Dunbrack
author_sort Daniel Ting
title Neighbor-dependent Ramachandran probability distributions of amino acids developed from a hierarchical Dirichlet process model.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/1cf6d30617ee43cbad901d1b339e3f60