Gene expression in all organisms is controlled by large protein-nucleic-acid assemblies called "cis-regulatory complexes." From transcription in bacteria to mRNA splicing in humans, cis-regulatory complexes act as molecular computers, tuning gene expression in response to information in the cellular environment. A mechanistic understanding of how these complexes function will have a major impact on basic science, synthetic biology, and human disease. This level of understanding requires biophysical models that quantitatively account for the protein-DNA, protein-RNA, and protein-protein interactions that occur within each cis-regulatory complex. Such models have been established for a handful of intensively studied systems, such as the lac promoter of Escherichia coli. However, the experiments used to establish these models require quantitative control over the in vivo concentrations of regulatory proteins, a requirement that is very hard to meet in less-well-understood contexts. In the coming years, my lab will pursue an alternative approach to deciphering biophysical models of cis-regulatory complexes in living cells. This innovative approach is highly scalable and applicable to a wide variety of biological systems. Our experiments will leverage massively parallel reporter assays performed on synthetic regulatory sequences that are designed to probe specific macromolecular interactions. These data will be used to decipher expression manifolds, mathematical objects whose inference bypasses the need to experimentally control in vivo protein concentrations. This program thus combines my training in theoretical physics and my extensive experience using high-throughput DNA sequencing to measure biophysical quantities. To emphasize the full generality of this approach, I am proposing work in two diverse biological contexts: transcriptional regulation in E. coli (Project 1) and alternative mRNA splicing in human cells (Project 2).
Project 1a will establish the capabilities and limitations of this approach in a well-understood bacterial system, while Project 1b will extend this approach to bacterial promoters about which little is yet known. Project 2a will develop a biophysical model for the integration of information encoded within 5′ and 3′ splice sites during exon definition. Project 2b will use biophysical modeling to better understand and guide improvements in antisense oligo treatments that correct splicing defects in human disease. Project 2 is not predicated on Project 1, but the strategies developed in our studies of bacterial transcription will inform and improve our studies of splicing in humans. This research program will thus establish a new approach for dissecting cis-regulatory complexes in a wide range of biological systems. It will also yield specific biophysical models that can be immediately and broadly applied to problems in synthetic biology, to the prediction of pathogenic genetic variants, and to the design of molecular therapeutics.

Public Health Relevance

Gene expression is controlled by large molecular machines called "cis-regulatory complexes." This application proposes a combination of mathematical modeling and high-throughput experiments to study how cis-regulatory complexes work inside living cells. This method has direct applications to fundamental problems in basic biological science, as well as to the understanding and treatment of multiple human diseases.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
1R35GM133777-01
Application #
9798232
Study Section
Special Emphasis Panel (ZGM1)
Program Officer
Resat, Haluk
Project Start
2019-09-01
Project End
2024-08-31
Budget Start
2019-09-01
Budget End
2020-08-31
Support Year
1
Fiscal Year
2019
Total Cost
Indirect Cost
Name
Cold Spring Harbor Laboratory
Department
Type
DUNS #
065968786
City
Cold Spring Harbor
State
NY
Country
United States
Zip Code
11724