Rapid advances in biotechnology are generating biological interaction data, such as protein-protein and gene-gene interaction networks, at an unprecedented rate, providing a powerful new resource and allowing old, yet important, biological questions to be reformulated in a new context. The size and complexity of these new types of data pose great challenges for experimental and computational biologists alike. Addressing these challenges has been a primary focus of much research under the umbrella term of systems biology. However, almost no work has been done on providing tools for simultaneous evolutionary analysis of genomic and interactomic data. This project will delineate the significant impact such a simultaneous analysis can have on understanding and analyzing biological interaction networks, and will explore new methodologies for conducting computational analyses. In particular, the project will address two areas that will help shed light on interaction networks and their complexity:

1. Novel genome-interactome evolutionary models. Coalescent theory has been one of the central models for establishing the relationships among gene genealogies and species phylogenies. In its current form, this theory does not allow for modeling events that arise in genomic studies, such as gene duplication and loss, nor has it been used to explain interaction network evolution. This research will extend coalescent theory to model genome-scale evolutionary events, and develop a new unified framework for modeling the simultaneous evolution of genomic and interactomic data.

2. Novel stochastic modeling and inference using graph grammars. Stochastic models, such as hidden Markov models and stochastic context-free grammars, have been used extensively in the analysis of biological sequence data. However, no equivalent models have been introduced for analysis of interaction networks. This research will explore new applications of stochastic graph grammars, as well as ways in which these stochastic models can be used to provide insightful analyses of these networks.

Broad Impact

Situated at the intersection of cellular, molecular, and evolutionary biology, this work will have a significant impact on the development and application of computational tools such as stochastic graph grammars and dissimilarity measures. The project will provide opportunities for training students in an interdisciplinary area, and will result in the development of new courses focused on evolutionary analysis of biological networks. The interdisciplinary nature of the proposed work will help recruit students from traditionally under-represented groups to computer science. The project methodologies will be implemented in software packages and made available through open-source mechanisms.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0845336
Program Officer: Vasant G. Honavar
Project Start:
Project End:
Budget Start: 2009-07-15
Budget End: 2014-06-30
Support Year:
Fiscal Year: 2008
Total Cost: $500,000
Indirect Cost:
Name: Rice University
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77005