Gene regulatory networks are defined by highly specific interactions between thousands of unique molecules. Transcription factors (TFs) play a central role in these networks, but much remains unknown regarding the structural basis of their sequence specificity and the connectivity between signaling pathways and TFs. We will develop novel computational methods to address these fundamental questions. We will also analyze post-transcriptional regulation of transcript stability by RNA-binding proteins. Most of our research effort will focus on yeast, but our methods will be applicable in all eukaryotes. For data access and experimental validation of our results, we will work with excellent high-throughput experimental collaborators. We will also perform more traditional follow-up experiments within our own laboratory.
Our first specific aim is to infer a structure-based protein-DNA recognition code from high-throughput binding data. By performing a simultaneous fit to in vitro binding data for a wide range of TFs, we will estimate free energy potentials for base-pair/amino-acid recognition. These will allow us to predict sequence specificity from the amino-acid sequence of the TF alone and design TFs with prescribed sequence specificity.
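As an illustration of how a table of base-pair/amino-acid free energy potentials could be turned into a sequence-specificity prediction, the following is a minimal sketch, not the project's actual model. It assumes that each DNA position is contacted by exactly one residue and that contributions are additive; the energy values in POTENTIAL are made-up toy numbers, and the function names are hypothetical.

```python
from math import exp

RT = 0.593  # kcal/mol, approximate value near 298 K
BASES = "ACGT"

# Toy, made-up contact energies in kcal/mol, keyed by amino acid, then base.
# In a real analysis these would be parameters fitted to binding data.
POTENTIAL = {
    "R": {"A": 0.0, "C": 0.3, "G": -1.0, "T": 0.2},
    "N": {"A": -0.8, "C": 0.1, "G": 0.0, "T": 0.2},
}

def relative_affinities(contact_residues):
    """For each position, return Boltzmann weights relative to the best base."""
    rows = []
    for aa in contact_residues:
        energies = POTENTIAL[aa]
        e_min = min(energies.values())
        # exp(-ddG/RT), where ddG is measured from the most favorable base
        rows.append({b: exp(-(energies[b] - e_min) / RT) for b in BASES})
    return rows

def preferred_site(contact_residues):
    """Highest-affinity DNA sequence under the additive toy model."""
    rows = relative_affinities(contact_residues)
    return "".join(max(row, key=row.get) for row in rows)

weights = relative_affinities("RN")
site = preferred_site("RN")
```

Under these assumptions, the specificity of a TF follows from its contact residues alone, which is the kind of prediction the aim describes.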
Our second aim is to identify modulators of TF activity using network-level genetic linkage analysis. We will develop a method that combines the power of genetic linkage analysis with prior information about transcriptional network connectivity, and identify quantitative trait loci whose allelic status affects TF activity. Using this approach, we will perform a comprehensive analysis of the connectivity between the signaling and the transcriptional networks in yeast.
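The idea of treating inferred TF activity as a quantitative trait can be sketched as follows. This is a simplified illustration under stated assumptions, not the method to be developed: it handles a single TF, infers activity in each segregant as the slope of expression regressed on a hypothetical promoter-affinity vector, and scores one marker with a Welch t-statistic. All data values are toy numbers and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def tf_activity(affinity, expression):
    """Least-squares slope of expression on promoter affinity (one TF, one sample)."""
    ma, me = mean(affinity), mean(expression)
    sxy = sum((a - ma) * (e - me) for a, e in zip(affinity, expression))
    sxx = sum((a - ma) ** 2 for a in affinity)
    return sxy / sxx

def linkage_t(activities, genotypes):
    """Welch t-statistic comparing activity between alleles 0 and 1 at one marker."""
    g0 = [a for a, g in zip(activities, genotypes) if g == 0]
    g1 = [a for a, g in zip(activities, genotypes) if g == 1]
    se = sqrt(variance(g0) / len(g0) + variance(g1) / len(g1))
    return (mean(g1) - mean(g0)) / se

# Toy prior connectivity: predicted affinity of four promoters for the TF.
affinity = [0.0, 1.0, 2.0, 3.0]

# Toy segregants: expression is exactly linear in affinity, so the true
# activity is recovered; real data would be noisy.
true_activity = [1.0, 1.2, 0.9, 2.0, 2.1, 1.9]
genotypes = [0, 0, 0, 1, 1, 1]  # allele inherited at one marker
expression = [[0.5 + act * a for a in affinity] for act in true_activity]

activities = [tf_activity(affinity, e) for e in expression]
t_stat = linkage_t(activities, genotypes)
```

In a genome-wide scan, the same statistic would be computed for every marker and every TF, with appropriate correction for multiple testing.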
Our third aim is to functionally dissect post-transcriptional regulation of mRNA stability. We previously demonstrated that steady-state mRNA expression data contains detailed information about the condition-specific control of mRNA half-life by RNA-binding proteins (RBPs). By integrating a novel high-throughput immunoprecipitation dataset for >40 RBPs with genome-wide mRNA expression data for a large number of physiological conditions, we will predict the conditions in which specific RBPs are active. We will analyze combinatorial cis-regulatory interactions with co-factors and use linkage analysis to map connectivity between signaling pathways and post-transcriptional networks.
Aberrant regulation of gene expression is often associated with disease. Furthermore, genetic differences between individuals affect responsiveness to drugs as well as disease prognosis. Our work will lead to theoretical and biological insights, as well as practical software tools and databases that will help basic and applied researchers to understand and predict the behavior of gene regulatory networks.

Public Health Relevance

This project aims to further develop computational algorithms and software that can be used to predict how DNA- and RNA-binding proteins "read" the genome sequence in order to control gene expression in a gene- and cell type-specific manner. These tools will allow researchers to understand how the behavior of gene regulatory networks is shaped by the genome sequence and affected by genetic differences between individuals. Aberrant regulation of gene expression is often associated with disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG003008-06
Application #
7840450
Study Section
Special Emphasis Panel (ZRG1-GGG-F (02))
Program Officer
Good, Peter J
Project Start
2004-08-13
Project End
2013-04-30
Budget Start
2010-05-01
Budget End
2011-04-30
Support Year
6
Fiscal Year
2010
Total Cost
$384,632
Indirect Cost
Name
Columbia University (N.Y.)
Department
Biology
Type
Other Domestic Higher Education
DUNS #
049179401
City
New York
State
NY
Country
United States
Zip Code
10027
Kribelbauer, Judith F; Laptenko, Oleg; Chen, Siying et al. (2017) Quantitative Analysis of the DNA Methylation Sensitivity of Transcription Factor Complexes. Cell Rep 19:2383-2395
Li, Jinsen; Sagendorf, Jared M; Chiu, Tsu-Pei et al. (2017) Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding. Nucleic Acids Res 45:12877-12887
van Arensbergen, Joris; FitzPatrick, Vincent D; de Haas, Marcel et al. (2017) Genome-wide mapping of autonomous promoter activity in human cells. Nat Biotechnol 35:145-153
Sagendorf, Jared M; Berman, Helen M; Rohs, Remo (2017) DNAproDB: an interactive tool for structural analysis of DNA-protein complexes. Nucleic Acids Res :
Bussemaker, Harmen J; Causton, Helen C; Fazlollahi, Mina et al. (2017) Network-based approaches that exploit inferred transcription factor activity to analyze the impact of genetic variation on gene expression. Curr Opin Syst Biol 2:98-102
Chiu, Tsu-Pei; Rao, Satyanarayan; Mann, Richard S et al. (2017) Genome-wide prediction of minor-groove electrostatic potential enables biophysical modeling of protein-DNA binding. Nucleic Acids Res 45:12565-12576
Fazlollahi, Mina; Muroff, Ivor; Lee, Eunjee et al. (2016) Identifying genetic modulators of the connectivity between transcription factors and their transcriptional targets. Proc Natl Acad Sci U S A 113:E1835-43
Chiu, Tsu-Pei; Comoglio, Federico; Zhou, Tianyin et al. (2016) DNAshapeR: an R/Bioconductor package for DNA shape prediction and feature encoding. Bioinformatics 32:1211-3
Bell, Robert J A; Rube, H Tomas; Xavier-Magalhães, Ana et al. (2016) Understanding TERT Promoter Mutations: A Common Path to Immortality. Mol Cancer Res 14:315-23
Zhou, Tianyin; Shen, Ning; Yang, Lin et al. (2015) Quantitative modeling of transcription factor binding specificities using DNA shape. Proc Natl Acad Sci U S A 112:4654-9

Showing the most recent 10 out of 51 publications