PANDA: Protein function prediction using domain architecture and affinity propagation

Abstract: We developed PANDA (Propagation of Affinity and Domain Architecture) to predict protein functions as Gene Ontology (GO) terms. PANDA first runs a profile-profile alignment algorithm to search the PfamA, KOG, COG, and SwissProt databases, and then launches PSI-BLAST ag...

Bibliographic Details
Main authors: Zheng Wang, Chenguang Zhao, Yiheng Wang, Zheng Sun, Nan Wang
Format: article
Language: EN
Published: Nature Portfolio 2018
Subjects:
R
Q
Online access: https://doaj.org/article/b61f5509c5fb4459a2857b45843d1ba2
id oai:doaj.org-article:b61f5509c5fb4459a2857b45843d1ba2
record_format dspace
spelling oai:doaj.org-article:b61f5509c5fb4459a2857b45843d1ba2
2021-12-02T15:08:53Z
PANDA: Protein function prediction using domain architecture and affinity propagation
10.1038/s41598-018-21849-1
2045-2322
https://doaj.org/article/b61f5509c5fb4459a2857b45843d1ba2
2018-02-01T00:00:00Z
https://doi.org/10.1038/s41598-018-21849-1
https://doaj.org/toc/2045-2322
Zheng Wang
Chenguang Zhao
Yiheng Wang
Zheng Sun
Nan Wang
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 8, Iss 1, Pp 1-10 (2018)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract: We developed PANDA (Propagation of Affinity and Domain Architecture) to predict protein functions as Gene Ontology (GO) terms. PANDA first runs a profile-profile alignment algorithm to search the PfamA, KOG, COG, and SwissProt databases, and then launches PSI-BLAST against UniProt for a homologue search. PANDA integrates a domain architecture inference algorithm based on Bayesian statistics that calculates the probability of a protein having a GO term. All candidate GO terms are pooled and filtered by Z-score. The remaining GO terms are then clustered using an affinity propagation algorithm based on the GO directed acyclic graph, followed by a second round of filtering on the clusters of GO terms. We benchmarked the performance of every baseline predictor that PANDA integrates, as well as every pooling and filtering step of PANDA. PANDA achieves a higher area under the precision-recall curve than the baseline predictors. PANDA can be accessed at http://dna.cs.miami.edu/PANDA/.
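The Z-score filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not PANDA's implementation: the abstract does not say how scores are pooled or which cutoff is used, so the function name `zscore_filter`, the `threshold` parameter, and the example scores below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_filter(scores, threshold=0.0):
    """Keep GO terms whose pooled score has a Z-score >= threshold.

    `scores` maps GO term IDs to confidence scores. The threshold value
    is an assumption for this sketch, not a value taken from PANDA.
    """
    if not scores:
        return {}
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the pool
    if sigma == 0:
        # All scores are identical, so a Z-score cannot rank them; keep all.
        return dict(scores)
    return {
        term: score
        for term, score in scores.items()
        if (score - mu) / sigma >= threshold
    }


# Hypothetical pooled scores for one query protein (illustrative values only).
pooled = {
    "GO:0003677": 0.92,
    "GO:0005634": 0.75,
    "GO:0006355": 0.40,
    "GO:0016020": 0.05,
}
kept = zscore_filter(pooled, threshold=0.0)
```

With a threshold of 0, only the terms scoring above the pool mean survive. In the pipeline the abstract describes, the surviving terms would then be clustered by affinity propagation and filtered a second time.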
format article
author Zheng Wang
Chenguang Zhao
Yiheng Wang
Zheng Sun
Nan Wang
title PANDA: Protein function prediction using domain architecture and affinity propagation
publisher Nature Portfolio
publishDate 2018
url https://doaj.org/article/b61f5509c5fb4459a2857b45843d1ba2
_version_ 1718388021742534656