High-throughput biotechnologies have generated a large number and variety of molecular networks, including protein interaction networks, gene coexpression networks, and regulatory networks. Network biology is an emerging field that aims to understand basic biological mechanisms and disease processes by using molecular networks. Computational and statistical tools are therefore urgently needed to mine biological knowledge from multiple networks. However, few such computational algorithms are available, and almost no statistical methods have been developed for multiple network analysis. The investigators hypothesize 1) that efficient score functions for gene subnetworks can be defined so that a high score correlates with biological significance, 2) that the statistical significance of biological networks is mathematically tractable, and 3) that efficient computational tools can be developed to find statistically significant patterns in biological networks. The objective of this application is to test these hypotheses. In addition, the researchers will develop the software necessary to implement these methods. As a practical application, and to gain an understanding of the molecular networks involved in aging, these algorithms will be applied to a large collection of aging-related gene expression datasets. The investigators will achieve these objectives through the following specific aims: 1) define novel scoring functions for network modules, taking both node degrees (the number of links of a node) and edge transitivity (the dependency between links forming triangles) into consideration, and develop efficient computational algorithms to identify molecular modules with high scores; 2) develop a rigorous theory to evaluate the statistical significance of the identified molecular modules; and 3) apply the fully developed tools to analyze a large collection of aging-related datasets and experimentally test a subset of the predictions in yeast.
The large number of networks, their size, and their complexity together make this an especially challenging project. The results of this research could be extremely useful for large-scale network analysis, and therefore for the systematic understanding of biology.
Identifying genetic subnetworks related to diseases or drug treatments is an important and challenging problem in biomedical research. The statistical and computational tools developed in this application for the analysis of multiple networks will be essential for this effort. The tools will be used to identify genetic networks specific to aging.