The long-term goal is to understand how human gene transcription is controlled and regulated. The hypothesis is that such an understanding may be achieved by developing mathematical models that predict promoter position and tissue-specific activity from local genetic and epigenetic information. Recently, large-scale experimental technologies have mapped a great number of active promoters in a genome. While these technologies are powerful, their rates of false positives (due to aberrant, likely nonfunctional mRNA transcripts), false negatives (due to incomplete sampling of tissues and developmental stages), and other errors (due to protocol biases) remain uncertain. Consequently, it is important to have additional approaches that incorporate more comprehensive or stringent criteria, and to examine sequence characteristics that, in addition to illuminating molecular mechanisms, may permit computational prediction and direct experimental detection of additional promoters. Even when all human promoters are mapped, merely documenting their positions will not tell us how they are recognized and deployed for transcription. Therefore, the more experimental mapping data become available, the more essential it becomes to develop mathematical models to understand promoter architecture, function and evolution. Now that the human genome has been completely sequenced and almost all of the protein-coding genes have been localized, understanding how each of these genes is controlled and regulated has become a major challenge in genome research.
Since a gene can often produce multiple transcripts through alternative promoter usage in different cells, at different developmental stages and/or in response to different signals, understanding the key elements that define and regulate alternative promoters will be a crucial task before more comprehensive gene regulation networks can be constructed. Powered by the ENCODE project, new high-throughput genomics technologies for attacking such problems are being developed at a rapid pace. Advanced computational approaches coupled with experimental validation are essential for the ultimate understanding of the regulatory mechanisms of gene expression. The new specific aims are: A1. Extract, compare and classify tissue-specific promoters in mammals so that they may be grouped into different (not necessarily mutually exclusive) expression and/or epigenetic classes; A2. Identify cis-regulatory motifs/modules as promoter architecture features and characterize their relation to tissue-specific chromatin and expression patterns; A3. Build mathematical models for tissue-specific promoter and expression predictions; A4. Conduct case studies of real regulatory pathways in selected tissues. The proposed research will combine experimental and computational approaches and technologies in order to better understand mammalian promoters in terms of genetic and epigenetic cis-regulatory codes. Such models are likely to offer new insights into mechanisms of gene regulation or mis-regulation, and will generate many hypotheses for further functional studies on global regulation of gene expression.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Good, Peter J
Cold Spring Harbor Laboratory
Cold Spring Harbor
United States
Zhang, Yu-An; Ma, Xiaotu; Sathe, Adwait et al. (2016) Validation of SCT Methylation as a Hallmark Biomarker for Lung Cancers. J Thorac Oncol 11:346-360
Chen, Yong; Wang, Yunfei; Xuan, Zhenyu et al. (2016) De novo deciphering three-dimensional chromatin interaction and topological domains by wavelet transformation of epigenetic profiles. Nucleic Acids Res 44:e106
Akerman, Martin; Fregoso, Oliver I; Das, Shipra et al. (2015) Differential connectivity of splicing activators and repressors to the human spliceosome. Genome Biol 16:119
Hu, Long; Di, Chao; Kai, Mingxuan et al. (2015) A common set of distinct features that characterize noncoding RNAs across multiple species. Nucleic Acids Res 43:104-14
Guo, Ya; Xu, Quan; Canzio, Daniele et al. (2015) CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell 162:900-10
Li, Wangzhi; Wu, Jie; Kim, Sang-Yong et al. (2014) Chd5 orchestrates chromatin remodelling during sperm development. Nat Commun 5:3812
Kim, Myoungjoo V; Ouyang, Weiming; Liao, Will et al. (2014) Murine in vivo CD8+ T Cell Killing Assay. Bio Protoc 4:
Weyn-Vanhentenryck, Sebastien M; Mele, Aldo; Yan, Qinghong et al. (2014) HITS-CLIP and integrative modeling define the Rbfox splicing-regulatory network linked to brain development and autism. Cell Rep 6:1139-1152
Kim, Myoungjoo V; Ouyang, Weiming; Liao, Will et al. (2014) Murine In vitro Memory T Cell Differentiation. Bio Protoc 4:
Xie, Wei; Schultz, Matthew D; Lister, Ryan et al. (2013) Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153:1134-48

Showing the most recent 10 out of 85 publications