TYLER, a fast method that accurately predicts cyclin-dependent proteins by using computation-based motifs and sequence-derived features

Bibliographic Details
Main authors: Jian Zhang, Xingchen Liang, Feng Zhou, Bo Li, Yanling Li
Format: article
Language: EN
Published: AIMS Press 2021
Online access: https://doaj.org/article/9dadfab7b444406a9311a61dbffdb647
Description
Summary: Cyclins and related cyclin-dependent kinases play vital roles in regulating progression through the cell cycle. Understanding the intrinsic mechanisms of cyclins promises knowledge about uncontrolled cell proliferation and the prevention of cancer. Therefore, accurate recognition of cyclins is important for the investigation of tumor cells and for biomedical engineering. This study proposes a novel sequence-based predictor named TYLER (predicT cYcLin-dEpendent pRoteins) to address the long-standing challenge of predicting cyclin-dependent proteins (CDPs). We use information theory to compute selectively enriched CDP-related motifs and build a motif-based model. For proteins that do not share the enriched motifs, we compute sequence-derived features and construct machine learning-based models. We optimize the weights of the two types of models to build a more accurate predictor. We evaluate both types of models using 5-fold cross-validation on the TRAINING dataset. We show that combining the two models and optimizing the corresponding weights gives decent and robust results on both the TRAINING dataset and the independent TEST dataset. The empirical test demonstrates that TYLER is a robust predictor and is statistically significantly better than current methods. The runtime assessment reveals that TYLER is an efficient, high-throughput method. We use TYLER to make predictions on the human proteome and use the results to hypothesize CDPs. The latest experimentally verified CDPs and GO analysis suggest that some of our novel predictions are potential CDPs. TYLER is implemented as a public, user-friendly web server at http://www.inforstation.com/webservers/TYLER/. We share all data and source code used in this research at https://github.com/biocomputinglab/TYLER.git.
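The summary says that the weights of the motif-based model and the machine learning-based model are optimized to build the final predictor, but this record does not say how. The sketch below illustrates one simple way such a combination could work; it is not the authors' implementation. The weighted average, the grid search, the accuracy criterion, the 0.5 threshold, the function names, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def combine_scores(motif_scores: Sequence[float],
                   ml_scores: Sequence[float],
                   weight: float) -> List[float]:
    """Weighted average of two per-protein scores (assumed scheme).

    `weight` is applied to the motif-based score and (1 - weight)
    to the machine learning-based score.
    """
    return [weight * m + (1.0 - weight) * s
            for m, s in zip(motif_scores, ml_scores)]


def accuracy(scores: Sequence[float],
             labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Fraction of proteins whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def best_weight(motif_scores: Sequence[float],
                ml_scores: Sequence[float],
                labels: Sequence[int],
                steps: int = 10) -> Tuple[float, float]:
    """Grid-search the weight in [0, 1] that maximizes accuracy.

    Returns (weight, accuracy); ties keep the smallest weight.
    """
    best = (0.0, -1.0)
    for i in range(steps + 1):
        w = i / steps
        acc = accuracy(combine_scores(motif_scores, ml_scores, w), labels)
        if acc > best[1]:
            best = (w, acc)
    return best


# Hypothetical toy data: 1 = cyclin-dependent protein, 0 = not.
motif_scores = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # motif hit or no hit
ml_scores = [0.6, 0.7, 0.2, 0.3, 0.8, 0.1]     # classifier probabilities
labels = [1, 0, 1, 0, 1, 0]

weight, acc = best_weight(motif_scores, ml_scores, labels)
print(f"selected weight = {weight:.1f}, accuracy = {acc:.3f}")
```

In a setting like the one the summary describes, the weight would be selected on held-out folds of the TRAINING dataset during cross-validation and then applied unchanged to the independent TEST dataset, so that the reported test performance is not influenced by the weight search.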