Microarray gene expression profiling is performed in many laboratories, resulting in the rapid accumulation of data in public repositories. However, due to the existence of different technology platforms and the lack of standard experimental protocols, systematic variation among data sets often exceeds the capability of statistical normalization. Currently, there is an urgent need for methodology to integrate cross-platform microarray data. This proposal addresses this need.
We aim to develop novel computational and statistical methods to integrate cross-platform microarray data. Specifically, we will (1) detect recurrent expression patterns across many microarray datasets; (2) perform functional and transcriptional annotation for multiple genomes; (3) predict transcription regulators for higher eukaryotic genes without prior information on protein-DNA binding sites; and (4) identify genetic networks that are signatures of diseases. Using our approach, we are in a position to extract an order of magnitude more information for any genome for which massive microarray data are available. We will perform "context-specific" functional and transcriptional annotation for the genomes of yeast (S. cerevisiae), worm (C. elegans), fly (D. melanogaster), plant (A. thaliana), mouse (M. musculus), rat (R. norvegicus) and human (H. sapiens). That is, we will conditionally annotate the functions and regulation of genes, depending on which other genes they interact with and under which conditions such interactions occur. When releasing our prediction results, we will attach to each annotation the necessary context information. Finally, we will develop a software package, ARRAYMINE, for biologists to perform integrative analysis of cross-platform microarray data. Our algorithms and software will significantly facilitate the re-use of the vast amount of existing microarray data, reduce the necessity to generate new data, and improve our understanding of cellular functions and networks under a variety of perturbations.
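
As an illustration of aim (1), the following is a minimal sketch of one way recurrent expression patterns might be detected without cross-platform normalization: correlations are computed within each dataset only, and a gene pair is kept when it is strongly correlated in several datasets. This is not the project's algorithm or the ARRAYMINE software. The function names (`pearson`, `recurrent_pairs`), the thresholds (`min_corr`, `min_support`), and the toy data are all assumptions made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def recurrent_pairs(datasets, min_corr=0.8, min_support=2):
    """Return {(gene_a, gene_b): support} for pairs recurring in >= min_support datasets.

    datasets: list of dicts mapping gene name -> list of expression values.
    Correlation is computed within each dataset only, so raw values are never
    compared across platforms. Strong negative correlation also counts here.
    """
    support = {}
    for data in datasets:
        for a, b in combinations(sorted(data), 2):
            if abs(pearson(data[a], data[b])) >= min_corr:
                support[(a, b)] = support.get((a, b), 0) + 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Toy data (invented for illustration): two "platforms" on very different scales.
toy = [
    {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]},
    {"g1": [100, 300, 200, 400], "g2": [10, 30, 20, 40], "g3": [5, 5, 6, 5]},
]
result = recurrent_pairs(toy)
print(result)
```

In the toy data, only the pair (g1, g2) is strongly correlated in both datasets, even though the two datasets use different measurement scales. Counting support across datasets, rather than pooling raw values, is the design choice that sidesteps platform-specific systematic variation in this sketch.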

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM074163-05
Application #
7753856
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Remington, Karin A
Project Start
2006-01-01
Project End
2011-06-30
Budget Start
2010-01-01
Budget End
2011-06-30
Support Year
5
Fiscal Year
2010
Total Cost
$188,696
Indirect Cost
Name
University of Southern California
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
072933393
City
Los Angeles
State
CA
Country
United States
Zip Code
90089
Dai, Chao; Li, Wenyuan; Liu, Juan et al. (2012) Integrating many co-splicing networks to reconstruct splicing regulatory modules. BMC Syst Biol 6 Suppl 1:S17
Li, Wenyuan; Zhang, Shihua; Liu, Chun-Chi et al. (2012) Identifying multi-layer gene regulatory modules from multi-dimensional genomic data. Bioinformatics 28:2458-66
Li, Wenyuan; Dai, Chao; Liu, Chun-Chi et al. (2012) Algorithm to identify frequent coupled modules from two-layered network series: application to study transcription and splicing coupling. J Comput Biol 19:710-30
Zhang, Shihua; Liu, Chun-Chi; Li, Wenyuan et al. (2012) Discovery of multi-dimensional modules by integrative analysis of cancer genomic data. Nucleic Acids Res 40:9379-91
Zhang, Shihua; Li, Qingjiao; Liu, Juan et al. (2011) A novel computational framework for simultaneous integration of multiple types of genomic data to identify microRNA-gene regulatory modules. Bioinformatics 27:i401-9
Li, Wenyuan; Liu, Chun-Chi; Zhang, Tong et al. (2011) Integrative analysis of many weighted co-expression networks using tensor computation. PLoS Comput Biol 7:e1001106
Mehan, Michael R; Nunez-Iglesias, Juan; Dai, Chao et al. (2010) An integrative modular approach to systematically predict gene-phenotype associations. BMC Bioinformatics 11 Suppl 1:S62
Huang, Haiyan; Liu, Chun-Chi; Zhou, Xianghong Jasmine (2010) Bayesian approach to transforming public gene expression repositories into disease diagnosis databases. Proc Natl Acad Sci U S A 107:6823-8
Li, Wenyuan; Xu, Min; Zhou, Xianghong Jasmine (2010) Unraveling complex temporal associations in cellular systems across multiple time-series microarray datasets. J Biomed Inform 43:550-9
Nunez-Iglesias, Juan; Liu, Chun-Chi; Morgan, Todd E et al. (2010) Joint genome-wide profiling of miRNA and mRNA expression in Alzheimer's disease cortex reveals altered miRNA regulation. PLoS One 5:e8898

Showing the most recent 10 out of 22 publications