Computational and Statistical Studies for Multiple Molecular Networks

Sun, Fengzhu

Abstract

High-throughput biotechnologies have generated a large number and variety of molecular networks, including protein interaction networks, gene coexpression networks, and regulatory networks. Network biology is an emerging field that aims to understand basic biological mechanisms and disease processes by using molecular networks. Computational and statistical tools are therefore urgently needed to mine biological knowledge from multiple networks. However, few such computational algorithms are available, and almost no statistical methods have been developed for multiple network analysis. The investigators hypothesize 1) that efficient score functions for gene subnetworks can be defined so that a high score correlates with biological significance, 2) that the statistical significance of biological networks is mathematically tractable, and 3) that efficient computational tools can be developed to find statistically significant patterns in biological networks. The objective of this application is to test these hypotheses. In addition, the researchers will develop the software necessary to implement these methods. As a practical application, and to gain an understanding of molecular networks involved in aging, these algorithms will be applied to a large collection of aging-related gene expression datasets. The investigators will achieve all of these objectives through the following specific aims: 1) define novel scoring functions for network modules, taking both node degrees (the number of links of a node) and edge transitivity (the dependency between links forming triangles) into consideration, and develop efficient computational algorithms to identify molecular modules with high scores; 2) develop a rigorous theory to evaluate the statistical significance of the identified molecular modules; and 3) apply the fully developed tools to analyze a large collection of aging-related datasets and experimentally test a subset of the predictions in yeast.
The large number of networks, their size, and their complexity together make this an especially challenging project. The results of this research could be extremely useful for large-scale network analysis, and therefore for the systematic understanding of biology.
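
The two quantities named in the first aim, node degree and edge transitivity, can be illustrated with a minimal sketch. This is not the investigators' scoring function, which the abstract does not specify. It only computes the two ingredients for a small undirected network, using the standard definition of transitivity as the fraction of connected triples that close into triangles. The function names and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # assumption: self-loops are ignored
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def node_degrees(adj):
    """Degree of each node: the number of distinct links it has."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def transitivity(adj):
    """Fraction of connected triples (a - center - b) whose ends are also linked."""
    triples = 0
    closed = 0
    for center, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Toy network (hypothetical): one triangle A-B-C plus a pendant node D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
adj = build_adjacency(edges)
degrees = node_degrees(adj)
t = transitivity(adj)
print(degrees)
print(t)
```

In the toy network, node C has degree 3 and three of the five connected triples are closed, so the transitivity is 0.6. A module score along the lines of the first aim would combine quantities like these, but how they are weighted is left open here.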

Public Health Relevance

Identifying genetic subnetworks related to diseases or drug treatments is an important and challenging problem in biomedical research. The statistical and computational tools developed in this application for the analysis of multiple networks will be essential to this effort. The tools will be used to identify genetic networks specific to aging.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Aging (NIA)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21AG032743-02
Application #: 7662378
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Murthy, Mahadev

Project Start: 2008-08-01
Project End: 2011-07-31
Budget Start: 2009-08-01
Budget End: 2011-07-31
Support Year: 2
Fiscal Year: 2009
Total Cost: $200,388
Indirect Cost

Institution

Name: University of Southern California
Department: Biology
Type: Schools of Arts and Sciences
DUNS #: 072933393

City: Los Angeles
State: CA
Country: United States
Zip Code: 90089

Related projects


NIH 2009 R21 AG	Computational and Statistical Studies for Multiple Molecular Networks Sun, Fengzhu / University of Southern California	$200,388
NIH 2008 R21 AG	Computational and Statistical Studies for Multiple Molecular Networks Sun, Fengzhu / University of Southern California	$167,075

Publications

Liu, Xuemei; Wan, Lin; Li, Jing et al. (2011) New powerful statistics for alignment-free sequence comparison under a pattern transfer model. J Theor Biol 284:106-16

Meng, Lu; Sun, Fengzhu; Zhang, Xuegong et al. (2011) Sequence alignment as hypothesis testing. J Comput Biol 18:677-91

Li, Wenyuan; Liu, Chun-Chi; Zhang, Tong et al. (2011) Integrative analysis of many weighted co-expression networks using tensor computation. PLoS Comput Biol 7:e1001106

Wan, Lin; Reinert, Gesine; Sun, Fengzhu et al. (2010) Alignment-free sequence comparison (II): theoretical power of comparison statistics. J Comput Biol 17:1467-90

Zhai, Zhiyuan; Ku, Shih-Yen; Luan, Yihui et al. (2010) The power of detecting enriched patterns: an HMM approach. J Comput Biol 17:581-92

Zhou, Linqi; Ma, Xiaotu; Arbeitman, Michelle N et al. (2009) Chromatin regulation and gene centrality are essential for controlling fitness pleiotropy in yeast. PLoS One 4:e8086

Wang, Wenhui; Nunez-Iglesias, Juan; Luan, Yihui et al. (2009) Usefulness and limitations of dK random graph models to predict interactions and functional homogeneity in biological networks under a pseudo-likelihood parameter estimation approach. BMC Bioinformatics 10:277

Reinert, Gesine; Chew, David; Sun, Fengzhu et al. (2009) Alignment-free sequence comparison (I): statistics and power. J Comput Biol 16:1615-34

Pape, Utz J; Rahmann, Sven; Sun, Fengzhu et al. (2008) Compound poisson approximation of the number of occurrences of a position frequency matrix (PFM) on both strands. J Comput Biol 15:547-64
