FFPred 2.0: improved homology-independent prediction of gene ontology terms for eukaryotic protein sequences.

To understand fully cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet, functional information - either experimentally validated or computationally inferred by s...

Bibliographic Details
Main authors: Federico Minneci, Damiano Piovesan, Domenico Cozzetto, David T Jones
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2013
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/29946e47243e4c09b0d5d2292a839ce8
id oai:doaj.org-article:29946e47243e4c09b0d5d2292a839ce8
record_format dspace
spelling oai:doaj.org-article:29946e47243e4c09b0d5d2292a839ce8
2021-11-18T07:44:41Z
ISSN 1932-6203
DOI 10.1371/journal.pone.0063754
https://doaj.org/article/29946e47243e4c09b0d5d2292a839ce8
2013-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23717476/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 8, Iss 5, p e63754 (2013)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description To fully understand cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet functional information, whether experimentally validated or computationally inferred by similarity, remains completely missing for approximately 30% of human proteins. FFPred was initially developed to bridge this gap by targeting sequences with distant or no homologues of known function and by exploiting clear patterns of intrinsic disorder associated with particular molecular activities and biological processes. Here, we present an updated and improved version, which builds on larger datasets of protein sequences and annotations, and uses updated component feature predictors as well as revised training procedures. FFPred 2.0 includes support vector regression models for the prediction of 442 Gene Ontology (GO) terms, which greatly expand the coverage of the ontology and of the biological process category in particular. The GO term list mainly revolves around macromolecular interactions and their role in regulatory, signalling, developmental and metabolic processes. Benchmarking experiments on newly annotated proteins show that FFPred 2.0 provides more accurate functional assignments than its predecessor and the ProtFun server do; its assignments can also complement information obtained using BLAST-based transfer of annotations, especially improving prediction in the biological process category. Furthermore, FFPred 2.0 can be used to annotate proteins belonging to several eukaryotic organisms with a limited decrease in prediction quality. We illustrate all these points using both precision-recall plots and COGIC scores, which we recently proposed as an alternative numerical evaluation measure of function prediction accuracy.
format article
author Federico Minneci
Damiano Piovesan
Domenico Cozzetto
David T Jones
title FFPred 2.0: improved homology-independent prediction of gene ontology terms for eukaryotic protein sequences.
publisher Public Library of Science (PLoS)
publishDate 2013
url https://doaj.org/article/29946e47243e4c09b0d5d2292a839ce8