RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functiona...

Bibliographic Details
Main authors: Rasna R Walia, Li C Xue, Katherine Wilkins, Yasser El-Manzalawy, Drena Dobbs, Vasant Honavar
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online access: https://doaj.org/article/eedb96f32dfa4f739f9b293cf7cfdb97
id oai:doaj.org-article:eedb96f32dfa4f739f9b293cf7cfdb97
record_format dspace
spelling oai:doaj.org-article:eedb96f32dfa4f739f9b293cf7cfdb97
2021-11-18T08:18:32Z
1932-6203
10.1371/journal.pone.0097725
https://doaj.org/article/eedb96f32dfa4f739f9b293cf7cfdb97
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24846307/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 9, Iss 5, p e97725 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
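The results above are reported as Matthews correlation coefficient (MCC) values, together with the precision/recall tradeoff. As a minimal sketch of how these per-residue metrics are computed from binary labels (1 = RNA-binding, 0 = non-binding), the Python below uses the standard definitions. The function names and the toy label vectors are illustrative assumptions; they are not taken from the RNABindRPlus software or from the RB44/RB111 test sets.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def precision_recall(tp, fp, fn):
    """Precision and recall for the binding class; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy example: 8 residues, 3 of them truly RNA-binding.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
score = mcc(tp, fp, tn, fn)
precision, recall = precision_recall(tp, fp, fn)
print(f"MCC={score:.3f} precision={precision:.3f} recall={recall:.3f}")
```

MCC ranges from -1 to 1 and accounts for all four confusion-matrix cells, which makes it a common choice when binding residues are much rarer than non-binding ones.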
format article
author Rasna R Walia
Li C Xue
Katherine Wilkins
Yasser El-Manzalawy
Drena Dobbs
Vasant Honavar
title RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/eedb96f32dfa4f739f9b293cf7cfdb97