Testing the ortholog conjecture with comparative functional genomic data from mammals.

Bibliographic Details
Main Authors: Nathan L Nehrt, Wyatt T Clark, Predrag Radivojac, Matthew W Hahn
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Online Access: https://doaj.org/article/1081c89f4e7944239ea8d61ff2938bb7
Description
Summary: A common assumption in comparative genomics is that orthologous genes share greater functional similarity than do paralogous genes (the "ortholog conjecture"). Many methods used to computationally predict protein function are based on this assumption, even though it is largely untested. Here we present the first large-scale test of the ortholog conjecture using comparative functional genomic data from human and mouse. We use the experimentally derived functions of more than 8,900 genes, as well as an independent microarray dataset, to directly assess our ability to predict function using both orthologs and paralogs. Both datasets show that paralogs are often a much better predictor of function than are orthologs, even at lower sequence identities. Among paralogs, those found within the same species are consistently more functionally similar than those found in a different species. We also find that paralogous pairs residing on the same chromosome are more functionally similar than those on different chromosomes, perhaps due to higher levels of interlocus gene conversion between these pairs. In addition to offering implications for the computational prediction of protein function, our results shed light on the relationship between sequence divergence and functional divergence. We conclude that the most important factor in the evolution of function is not amino acid sequence, but rather the cellular context in which proteins act.