The main aim of this proposal is to develop a comprehensive quantitative evolutionary theory of the structure-function relationship in proteins. To this end, a novel graph-theoretic approach to studying proteins is proposed, whereby the universe of all protein domains is organized into a graph (the protein domain universe graph, PDUG) based on structural, functional, or metabolic-participation similarities. This provides a multidimensional description of proteins at the level of all existing domains or of whole proteomes in specific organisms. Using graph theory to analyze structural, functional, and metabolic protein domain universes makes it possible to gain unique insights into the evolutionary origin of proteins and the cause of their diversity. Comparing protein domain universe graphs from different organisms helps to create a new paradigm in phylogeny, whereby the tree of life is built not on specific genes or RNA molecules, but on whole proteomes placed in a multidimensional space of structural, functional, and metabolic relationships. Furthermore, analysis of the robust properties of protein domain universe graphs makes it possible to develop testable dynamic models of protein evolution that encompass a range of evolutionary time scales, from single mutations to the evolution of organisms. The research plan encompasses several crucial steps to achieve these specific aims. First, a new quantitative graph-theoretic description of functional and metabolic relationships between proteins will be developed. It will be based on a hierarchical description of the functional and metabolic annotation of proteins, and will use Markov models to quantify distances in functional and metabolic spaces, as well as to quantify functional distances between enzymes via graph-based similarity comparisons between their metabolites.
Using these new quantitative descriptions, multidimensional protein domain universe graphs will be constructed, and each will be partitioned into disjoint clusters of structurally, functionally, and metabolically similar proteins. The overlap between these clusters measures the extent of the structure-function relationship and will also relate the functions of proteins to their participation in particular metabolic pathways. By creating multidimensional protein domain universe graphs for various organisms, we will first evaluate the degree of participation of various structural and functional templates in different organisms, and by comparing those, we will create a comprehensive tree of life that will shed light on major evolutionary events. These findings will be applied to enhance our ability to predict the structure and function of novel proteins, leading to possible therapeutic applications. Our findings will be available to the scientific community via the ELISA database.
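As a purely illustrative sketch of the clustering and overlap steps described above, the following Python fragment builds two toy similarity graphs over the same set of domains, splits each into disjoint clusters, and scores the overlap between the two partitions. Everything in it is an assumption made for illustration: the domain names, the similarity scores, the 0.5 threshold, the use of connected components as the clustering rule, and the Jaccard index as the overlap measure are not taken from the proposal, which does not specify these choices.

```python
def build_graph(nodes, scores, threshold):
    """Link two domains when their similarity score reaches the threshold."""
    adjacency = {node: set() for node in nodes}
    for (a, b), score in scores.items():
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Partition the graph into disjoint clusters (its connected components)."""
    seen = set()
    clusters = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(frozenset(component))
    return clusters


def jaccard(a, b):
    """Shared members divided by all members of two clusters."""
    return len(a & b) / len(a | b)


def best_overlap(clusters_a, clusters_b):
    """For each cluster in A, the highest Jaccard overlap with any cluster in B."""
    return {ca: max(jaccard(ca, cb) for cb in clusters_b) for ca in clusters_a}


# Hypothetical domains and made-up similarity scores (not real data).
domains = ["d1", "d2", "d3", "d4", "d5"]
structural_scores = {("d1", "d2"): 0.9, ("d2", "d3"): 0.8,
                     ("d3", "d4"): 0.2, ("d4", "d5"): 0.7}
functional_scores = {("d1", "d2"): 0.9, ("d3", "d4"): 0.6,
                     ("d4", "d5"): 0.1}

THRESHOLD = 0.5  # assumed cutoff, chosen only for this toy example

structural_clusters = connected_components(
    build_graph(domains, structural_scores, THRESHOLD))
functional_clusters = connected_components(
    build_graph(domains, functional_scores, THRESHOLD))

# Overlap of each structural cluster with its best-matching functional cluster.
overlap = best_overlap(structural_clusters, functional_clusters)
for cluster, score in overlap.items():
    print(sorted(cluster), round(score, 3))
```

A high overlap score for a structural cluster would suggest that its members also share function. A low score would indicate structurally similar domains that have diverged functionally.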