Prediction of protein-protein interaction sites by random forest algorithm with mRMR and IFS.
Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this st...
Main authors: | Bi-Qing Li, Kai-Yan Feng, Lei Chen, Tao Huang, Yu-Dong Cai |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2012 |
Subjects: | Medicine (R); Science (Q) |
Online access: | https://doaj.org/article/e8eab8813c9f453a958baa213f02f01d |
id |
oai:doaj.org-article:e8eab8813c9f453a958baa213f02f01d |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:e8eab8813c9f453a958baa213f02f01d; 2021-11-18T07:07:27Z; Prediction of protein-protein interaction sites by random forest algorithm with mRMR and IFS.; ISSN 1932-6203; DOI 10.1371/journal.pone.0043927; https://doaj.org/article/e8eab8813c9f453a958baa213f02f01d; 2012-01-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22937126/?tool=EBI; https://doaj.org/toc/1932-6203; Bi-Qing Li; Kai-Yan Feng; Lei Chen; Tao Huang; Yu-Dong Cai; Public Library of Science (PLoS); article; Medicine (R); Science (Q); EN; PLoS ONE, Vol 7, Iss 8, p e43927 (2012) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
description |
Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on the Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. It was also shown via site-specific feature analysis that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction. |
format |
article |
author |
Bi-Qing Li Kai-Yan Feng Lei Chen Tao Huang Yu-Dong Cai |
title |
Prediction of protein-protein interaction sites by random forest algorithm with mRMR and IFS. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2012 |
url |
https://doaj.org/article/e8eab8813c9f453a958baa213f02f01d |
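The description above names a pipeline of mRMR feature ranking followed by incremental feature selection (IFS), scored with MCC. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the toy binary data, the difference form of the mRMR criterion (relevance minus mean redundancy), and above all the classifier, which is a simple leave-one-out lookup table swapped in for the random forest so that the example runs with the standard library alone.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_rank(columns, labels):
    """Greedy mRMR ranking: repeatedly pick the feature with the highest
    relevance to the labels minus mean redundancy with features already picked."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining, ranked = list(range(len(columns))), []
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(mutual_information(columns[j], columns[k])
                             for k in ranked) / len(ranked)
            return relevance[j] - redundancy
        best = max(remaining, key=score)  # ties go to the lowest index
        ranked.append(best)
        remaining.remove(best)
    return ranked


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels."""
    pairs = Counter(zip(y_true, y_pred))
    tp, tn, fp, fn = pairs[1, 1], pairs[0, 0], pairs[0, 1], pairs[1, 0]
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def loo_mcc(columns, labels, subset):
    """Leave-one-out MCC of a lookup-table classifier.
    Hypothetical stand-in for the random forest named in the abstract."""
    keys = [tuple(columns[j][i] for j in subset) for i in range(len(labels))]
    preds = []
    for i, key in enumerate(keys):
        others = [(k, y) for n, (k, y) in enumerate(zip(keys, labels)) if n != i]
        # Vote among other samples with identical feature values;
        # fall back to the overall majority when there are none.
        votes = (Counter(y for k, y in others if k == key)
                 or Counter(y for _, y in others))
        preds.append(votes.most_common(1)[0][0])
    return mcc(labels, preds)


def incremental_feature_selection(ranked, evaluate):
    """IFS: score the top-1, top-2, ... prefixes of the ranking and
    keep the shortest prefix with the best score."""
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])
        if score > best_score:
            best_subset, best_score = ranked[:k], score
    return best_subset, best_score


# Toy data (invented for this sketch): 8 samples, 3 binary features.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],  # informative feature
    [1, 1, 1, 0, 0, 0, 0, 0],  # exact copy of the first (fully redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the labels
]
ranked = mrmr_rank(columns, labels)
best_subset, best_score = incremental_feature_selection(
    ranked, lambda subset: loo_mcc(columns, labels, subset))
print(ranked, best_subset, round(best_score, 4))
```

On this toy input the redundant copy is ranked last even though it is as relevant as the first feature, which is the behaviour the redundancy term is meant to produce. In a realistic setting the `evaluate` callback would presumably wrap a cross-validated random forest rather than the lookup table used here.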