Gene regulatory networks are defined by highly specific interactions between thousands of unique molecules. Transcription factors (TFs) play a central role in these networks, but much remains unknown regarding the structural basis of their sequence specificity and the connectivity between signaling pathways and TFs. We will develop novel computational methods to address these fundamental questions. We will also analyze post-transcriptional regulation of transcript stability by RNA-binding proteins. Most of our research effort will focus on yeast, but our methods will be applicable in all eukaryotes. For data access and experimental validation of our results, we will work with excellent high-throughput experimental collaborators. We will also perform more traditional follow-up experiments within our own laboratory.
Our first specific aim is to infer a structure-based protein-DNA recognition code from high-throughput binding data. By performing a simultaneous fit to in vitro binding data for a wide range of TFs, we will estimate free energy potentials for base-pair/amino-acid recognition. These will allow us to predict sequence specificity from the amino-acid sequence of the TF alone and to design TFs with prescribed sequence specificity.
Our second aim is to identify modulators of TF activity using network-level genetic linkage analysis. We will develop a method that combines the power of genetic linkage analysis with prior information about transcriptional network connectivity, and identify quantitative trait loci whose allelic status affects TF activity. Using this approach, we will perform a comprehensive analysis of the connectivity between the signaling and the transcriptional networks in yeast.
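As an illustration only, the idea behind the second aim can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method the project will develop. It assumes a single TF, a known connectivity value per gene, binary marker genotypes, and simulated data. All names (`tf_activity`, `linkage_scan`, `causal_marker`) are hypothetical. TF activity in each segregant is estimated as the regression slope of that segregant's expression profile on the connectivity values, and each marker is then scored with a Welch t-statistic comparing inferred activity between the two alleles.

```python
import math
import random
import statistics


def tf_activity(expression, connectivity):
    """Regression slope of one segregant's expression profile on TF connectivity."""
    mean_c = statistics.fmean(connectivity)
    mean_e = statistics.fmean(expression)
    num = sum((c - mean_c) * (e - mean_e) for c, e in zip(connectivity, expression))
    den = sum((c - mean_c) ** 2 for c in connectivity)
    return num / den


def welch_t(group_a, group_b):
    """Welch t-statistic for the difference in means (group_b minus group_a)."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return (statistics.fmean(group_b) - statistics.fmean(group_a)) / se


def linkage_scan(activities, genotypes):
    """Score every marker by how strongly its allele separates inferred TF activity."""
    scores = []
    for marker in genotypes:
        allele0 = [a for a, g in zip(activities, marker) if g == 0]
        allele1 = [a for a, g in zip(activities, marker) if g == 1]
        scores.append(welch_t(allele0, allele1))
    return scores


# Simulated data (assumption): marker 4 modulates the activity of the TF.
random.seed(0)
n_genes, n_segregants, n_markers, causal_marker = 100, 60, 10, 4
connectivity = [random.gauss(0, 1) for _ in range(n_genes)]
genotypes = [[random.randint(0, 1) for _ in range(n_segregants)]
             for _ in range(n_markers)]
true_activity = [genotypes[causal_marker][s] + random.gauss(0, 0.1)
                 for s in range(n_segregants)]
expression = [[true_activity[s] * c + random.gauss(0, 0.5) for c in connectivity]
              for s in range(n_segregants)]

# Step 1: infer TF activity per segregant. Step 2: scan markers for linkage.
activities = [tf_activity(profile, connectivity) for profile in expression]
scores = linkage_scan(activities, genotypes)
best_marker = max(range(n_markers), key=lambda m: abs(scores[m]))
print("marker with strongest linkage to TF activity:", best_marker)
```

Treating inferred TF activity, rather than the expression level of each individual gene, as the quantitative trait pools the signal across all targets of the TF. This is one plausible reading of how prior connectivity information could add power to a linkage scan.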
Our third aim is to functionally dissect post-transcriptional regulation of mRNA stability. We previously demonstrated that steady-state mRNA expression data contain detailed information about the condition-specific control of mRNA half-life by RNA-binding proteins (RBPs). By integrating a novel high-throughput immunoprecipitation dataset for >40 RBPs with genome-wide mRNA expression data for a large number of physiological conditions, we will predict the conditions in which specific RBPs are active. We will analyze combinatorial cis-regulatory interactions with co-factors and use linkage analysis to map connectivity between signaling pathways and post-transcriptional networks.
Aberrant regulation of gene expression is often associated with disease. Furthermore, genetic differences between individuals affect responsiveness to drugs as well as disease prognosis. Our work will lead to theoretical and biological insights, as well as practical software tools and databases that will help basic and applied researchers to understand and predict the behavior of gene regulatory networks.

Public Health Relevance

This project aims to further develop computational algorithms and software that can be used to predict how DNA- and RNA-binding proteins read the genome sequence in order to control gene expression in a gene- and cell type-specific manner. These tools will allow researchers to understand how the behavior of gene regulatory networks is shaped by the genome sequence and affected by genetic differences between individuals. Aberrant regulation of gene expression is often associated with disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG003008-08
Application #
8274820
Study Section
Special Emphasis Panel (ZRG1-GGG-F (02))
Program Officer
Pazin, Michael J
Project Start
2004-08-13
Project End
2013-07-15
Budget Start
2012-05-01
Budget End
2013-07-15
Support Year
8
Fiscal Year
2012
Total Cost
$381,046
Indirect Cost
$136,021
Name
Columbia University (N.Y.)
Department
Biology
Type
Other Domestic Higher Education
DUNS #
049179401
City
New York
State
NY
Country
United States
Zip Code
10027
Fazlollahi, Mina; Muroff, Ivor; Lee, Eunjee et al. (2016) Identifying genetic modulators of the connectivity between transcription factors and their transcriptional targets. Proc Natl Acad Sci U S A 113:E1835-43
Chiu, Tsu-Pei; Comoglio, Federico; Zhou, Tianyin et al. (2016) DNAshapeR: an R/Bioconductor package for DNA shape prediction and feature encoding. Bioinformatics 32:1211-3
Bell, Robert J A; Rube, H Tomas; Xavier-Magalhães, Ana et al. (2016) Understanding TERT Promoter Mutations: A Common Path to Immortality. Mol Cancer Res 14:315-23
Zhou, Tianyin; Shen, Ning; Yang, Lin et al. (2015) Quantitative modeling of transcription factor binding specificities using DNA shape. Proc Natl Acad Sci U S A 112:4654-9
Lu, Xiang-Jun; Bussemaker, Harmen J; Olson, Wilma K (2015) DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res 43:e142
Abe, Namiko; Dror, Iris; Yang, Lin et al. (2015) Deconvolving the recognition of DNA shape from sequence. Cell 161:307-18
Riley, Todd R; Lazarovici, Allan; Mann, Richard S et al. (2015) Building accurate sequence-to-affinity models from high-throughput in vitro protein-DNA binding data using FeatureREDUCE. Elife 4:
Dantas Machado, Ana Carolina; Zhou, Tianyin; Rao, Satyanarayan et al. (2015) Evolving insights on how cytosine methylation affects protein-DNA binding. Brief Funct Genomics 14:61-73
Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin et al. (2015) GBshape: a genome browser database for DNA shape annotations. Nucleic Acids Res 43:D103-9
Bussemaker, Harmen J (2015) Recent progress in understanding transcription factor binding specificity. Brief Funct Genomics 14:1-2

Showing the most recent 10 out of 43 publications