The primary objective of this project is the development and application of computational methods for the analysis and interpretation of large-scale gene expression data. The use of high-density DNA arrays to monitor gene expression at a genome-wide scale constitutes a fundamental advance in biology. In particular, the dynamic expression patterns of all genes in the entire genome of an organism can be interrogated using sequential microarray hybridization of cDNA libraries. It is well known that complex gene expression patterns result from dynamic interacting networks of genes in the genetic regulatory circuitry. Hierarchical and modular organization of regulatory DNA sequence elements is important for combinatorial control and regulation of gene expression. To meet the challenge of interpreting the massive gene expression data now being generated as a result of the availability of complete genome sequences and DNA chips, we propose to develop computational methods and to test them in actual analyses of such large-scale experimental data. This proposal has two specific aims: (1) developing promoter databases and flexible computational tools that can efficiently facilitate transcriptome analysis; and (2) solving real biological problems by collaborating with leading bench scientists on studies of regulatory cis-elements and the mechanism of transcriptional regulation in specific biological systems and/or processes. We believe these two aims are inseparable: without computational and experimental biologists closely joining forces, our ability to attack complex biological problems will be severely limited.
This proposal is likely to contribute to our understanding of two related circuits: one that is hard-coded as a genetic program in the gene promoter architecture, and another that is wired in space-time as a linked network by interacting gene transcripts and products. Both can be activated or manifested in response to specific signals. In turn, the results will help us to understand normal biological processes such as the cell cycle, homeostasis, growth, and development, as well as pathological manifestations such as metabolic diseases, developmental abnormalities, and cancer.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM060513-03
Application #
6526200
Study Section
Special Emphasis Panel (ZRG1-SSS-Y (02))
Program Officer
Anderson, James J
Project Start
2000-08-01
Project End
2004-07-31
Budget Start
2002-08-01
Budget End
2004-07-31
Support Year
3
Fiscal Year
2002
Total Cost
$298,800
Indirect Cost
Name
Cold Spring Harbor Laboratory
Department
Type
DUNS #
065968786
City
Cold Spring Harbor
State
NY
Country
United States
Zip Code
11724
Das, Debopriya; Zhang, Michael Q (2007) Predictive models of gene regulation: application of regression methods to microarray data. Methods Mol Biol 377:95-110
Smith, Andrew D; Sumazin, Pavel; Xuan, Zhenyu et al. (2006) DNA motifs in human and mouse proximal promoters predict tissue-specific expression. Proc Natl Acad Sci U S A 103:6275-80
Xuan, Zhenyu; Zhao, Fang; Wang, Jinhua et al. (2005) Genome-wide promoter extraction and analysis in human, mouse, and rat. Genome Biol 6:R72
Sumazin, Pavel; Chen, Gengxin; Hata, Naoya et al. (2005) DWE: discriminating word enumerator. Bioinformatics 21:31-8
Smith, Andrew D; Sumazin, Pavel; Das, Debopriya et al. (2005) Mining ChIP-chip data for transcription factor and cofactor binding sites. Bioinformatics 21 Suppl 1:i403-12
Smith, Andrew D; Sumazin, Pavel; Zhang, Michael Q (2005) Identifying tissue-selective transcription factor binding sites in vertebrate promoters. Proc Natl Acad Sci U S A 102:1560-5
Chen, Gengxin; Hata, Naoya; Zhang, Michael Q (2004) Transcription factor binding element detection using functional clustering of mutant expression data. Nucleic Acids Res 32:2362-71
Das, Debopriya; Banerjee, Nilanjana; Zhang, Michael Q (2004) Interacting models of cooperative gene regulation. Proc Natl Acad Sci U S A 101:16234-9
Kato, Mamoru; Hata, Naoya; Banerjee, Nilanjana et al. (2004) Identifying combinatorial regulation of transcription factors and binding motifs. Genome Biol 5:R56
Zhang, M Q (2003) Prediction, annotation, and analysis of human promoters. Cold Spring Harb Symp Quant Biol 68:217-25

Showing the most recent 10 out of 14 publications