Gene regulatory networks are defined by highly specific interactions between thousands of unique molecules. Transcription factors (TFs) play a central role in these networks, but much remains unknown regarding the structural basis of their sequence specificity and the connectivity between signaling pathways and TFs. We will develop novel computational methods to address these fundamental questions. We will also analyze post-transcriptional regulation of transcript stability by RNA-binding proteins. Most of our research effort will focus on yeast, but our methods will be applicable in all eukaryotes. For data access and experimental validation of our results, we will work with excellent high-throughput experimental collaborators. We will also perform more traditional follow-up experiments within our own laboratory.
Our first specific aim is to infer a structure-based protein-DNA recognition code from high-throughput binding data. By performing a simultaneous fit to in vitro binding data for a wide range of TFs, we will estimate free energy potentials for base-pair/amino-acid recognition. These will allow us to predict sequence specificity from the amino-acid sequence of the TF alone and design TFs with prescribed sequence specificity.
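The idea of a simultaneous fit across TFs can be illustrated with a minimal sketch. It assumes, purely for illustration, that the binding free energy of a TF-site pair is the sum of one potential per amino-acid/base contact and that residue i contacts base i; the abstract does not specify the model form or the contact map. All names (`contact_counts`, `fit_potentials`, `predict_ddg`) and all energy values are hypothetical toy inputs, not measured data.

```python
from collections import Counter

def contact_counts(residues, bases):
    """Count (amino acid, base) pairs, pairing residue i with base i (assumed contact map)."""
    return Counter(zip(residues, bases))

def fit_potentials(observations, lr=0.1, epochs=500):
    """Fit one potential per (amino acid, base) pair by stochastic gradient descent
    on the squared error between summed contact potentials and measured ddG."""
    eps = {}
    for _ in range(epochs):
        for residues, bases, ddg in observations:
            counts = contact_counts(residues, bases)
            pred = sum(n * eps.get(pair, 0.0) for pair, n in counts.items())
            err = pred - ddg
            for pair, n in counts.items():
                eps[pair] = eps.get(pair, 0.0) - lr * err * n
    return eps

def predict_ddg(eps, residues, bases):
    """Predict ddG for a TF/site pair from the fitted potentials alone."""
    counts = contact_counts(residues, bases)
    return sum(n * eps[pair] for pair, n in counts.items())

# Made-up binding data for two toy TFs ("RR" and "NN"), pooled into one fit.
# Each entry: (contact residues of the TF, bases of the site, ddG in arbitrary units).
observations = [
    ("RR", "GG", -2.0),
    ("RR", "GA", -0.5),
    ("RR", "AA", 1.0),
    ("NN", "GG", 0.8),
    ("NN", "AA", -1.6),
]

eps = fit_potentials(observations)

# A third toy TF ("RN") that was never measured: its preference is predicted
# from its residues and the potentials learned from the other TFs.
predicted = predict_ddg(eps, "RN", "GA")
print(round(predicted, 3))
```

Because every TF contributes to the same set of potentials, data for one factor constrain predictions for another, which is the sense in which specificity could be predicted from the amino-acid sequence alone under this assumed additive model.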
Our second aim is to identify modulators of TF activity using network-level genetic linkage analysis. We will develop a method that combines the power of genetic linkage analysis with prior information about transcriptional network connectivity, and identify quantitative trait loci whose allelic status affects TF activity. Using this approach, we will perform a comprehensive analysis of the connectivity between the signaling and the transcriptional networks in yeast.
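One way to picture this combination is a two-step sketch: first summarize each segregant's expression profile as an inferred TF activity using prior connectivity, then test whether that activity differs between the two alleles at a marker. This is an illustrative assumption about how such a method might be organized, not a description of the actual algorithm; the function names and all numbers below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def tf_activity(expression, connectivity):
    """Least-squares slope of one segregant's gene expression on the
    prior TF-gene connectivity; used here as the inferred TF activity."""
    mx, my = mean(connectivity), mean(expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(connectivity, expression))
    sxx = sum((x - mx) ** 2 for x in connectivity)
    return sxy / sxx

def linkage_t(activities, genotypes):
    """Welch t-statistic comparing inferred activity between the two
    alleles (coded 0 and 1) at a single marker."""
    a = [v for v, g in zip(activities, genotypes) if g == 0]
    b = [v for v, g in zip(activities, genotypes) if g == 1]
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(b) - mean(a)) / se

# Hypothetical prior connectivity of one TF to four genes.
connectivity = [0.0, 1.0, 2.0, 3.0]

# Made-up activities for six segregants, used only to generate toy profiles.
toy_activity = [1.0, 1.2, 0.8, 2.0, 2.2, 1.8]
profiles = [[0.5 + a * c for c in connectivity] for a in toy_activity]

# Allele carried by each segregant at one marker.
genotypes = [0, 0, 0, 1, 1, 1]

activities = [tf_activity(p, connectivity) for p in profiles]
t_stat = linkage_t(activities, genotypes)
print(round(t_stat, 2))
```

A genome scan would repeat the second step for every marker and every TF, treating loci with large absolute statistics as candidate modulators of that TF's activity, subject to multiple-testing correction.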
Our third aim is to functionally dissect post-transcriptional regulation of mRNA stability. We previously demonstrated that steady-state mRNA expression data contains detailed information about the condition-specific control of mRNA half-life by RNA-binding proteins (RBPs). By integrating a novel high-throughput immunoprecipitation dataset for >40 RBPs with genome-wide mRNA expression data for a large number of physiological conditions, we will predict the conditions in which specific RBPs are active. We will analyze combinatorial cis-regulatory interactions with co-factors and use linkage analysis to map connectivity between signaling pathways and post-transcriptional networks.
Aberrant regulation of gene expression is often associated with disease. Furthermore, genetic differences between individuals affect responsiveness to drugs as well as disease prognosis. Our work will lead to theoretical and biological insights, as well as practical software tools and databases that will help basic and applied researchers to understand and predict the behavior of gene regulatory networks.

Public Health Relevance

This project aims to further develop computational algorithms and software that can be used to predict how DNA- and RNA-binding proteins read the genome sequence in order to control gene expression in a gene- and cell type-specific manner. These tools will allow researchers to understand how the behavior of gene regulatory networks is shaped by the genome sequence and affected by genetic differences between individuals. Aberrant regulation of gene expression is often associated with disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-GGG-F (02))
Program Officer
Pazin, Michael J
Columbia University (N.Y.)
Other Domestic Higher Education
New York
United States
Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin et al. (2015) GBshape: a genome browser database for DNA shape annotations. Nucleic Acids Res 43:D103-9
Dantas Machado, Ana Carolina; Zhou, Tianyin; Rao, Satyanarayan et al. (2015) Evolving insights on how cytosine methylation affects protein-DNA binding. Brief Funct Genomics 14:61-73
Slattery, Matthew; Zhou, Tianyin; Yang, Lin et al. (2014) Absence of a simple code: how transcription factors read the genome. Trends Biochem Sci 39:381-99
Riley, Todd R; Slattery, Matthew; Abe, Namiko et al. (2014) SELEX-seq: a method for characterizing the complete repertoire of binding site preferences for transcription factor complexes. Methods Mol Biol 1196:255-78
Ghosh, Hiyaa S; Ceribelli, Michele; Matos, Ines et al. (2014) ETO family protein Mtg16 regulates the balance of dendritic cell subsets by repressing Id2. J Exp Med 211:1623-35
Zhang, Xiaojun; Dantas Machado, Ana Carolina; Ding, Yuan et al. (2014) Conformations of p53 response elements in solution deduced using site-directed spin labeling and Monte Carlo sampling. Nucleic Acids Res 42:2789-97
Lee, Eunjee; de Ridder, Jeroen; Kool, Jaap et al. (2014) Identifying regulatory mechanisms underlying tumorigenesis using locus expression signature analysis. Proc Natl Acad Sci U S A 111:5747-52
van Arensbergen, Joris; van Steensel, Bas; Bussemaker, Harmen J (2014) In search of the determinants of enhancer-promoter interaction specificity. Trends Cell Biol 24:695-702
Dror, Iris; Zhou, Tianyin; Mandel-Gutfreund, Yael et al. (2014) Covariation between homeodomain transcription factors and the shape of their DNA binding sites. Nucleic Acids Res 42:430-41
Yang, Lin; Zhou, Tianyin; Dror, Iris et al. (2014) TFBSshape: a motif database for DNA shape features of transcription factor binding sites. Nucleic Acids Res 42:D148-55

Showing the most recent 10 out of 32 publications