How do changes in DNA sequence impact organismal properties? This is a central question of modern biology, and insights into it can help us understand, among other things, why patients respond differently to the same treatment, or why some species exhibit behavioral properties not seen in other species. A major hurdle in solving this "genotype-to-phenotype" problem is our poor knowledge of gene regulatory mechanisms underlying phenotypes and cellular processes, and how those mechanisms are encoded in DNA. This knowledge gap also leads to severe difficulties in prioritizing phenotype-linked non-coding variants (polymorphisms) for further investigation. Driven by these challenges, my lab seeks to develop quantitative frameworks for describing and discovering transcriptional regulatory mechanisms. We have made significant progress towards this goal in two main directions: (1) We have developed detailed biophysical models of the cis-regulatory encoding of gene expression. Using these models, we have shown how the regulatory function of transcription factor (TF) binding sites depends on their sequence and DNA shape, as well as their "trans-context", e.g., cellular concentrations of regulators, and "cis-context", e.g., proximity to other TF binding sites and chromatin states. (2) We have devised statistical models to discover TF-gene interactions from transcriptomic data, as well as other types of "omics" data if available. Working closely with biologists, we have applied these models to understand phenotypes such as cytotoxic drug response in cell lines, behavioral response to social encounters, and embryonic development. Building on the strong foundations of our past work, I propose to establish a research program that studies transcriptional regulation holistically at the cis- and trans- levels.
Our new pursuits will include: (1) use of our computational, sequence-level models to describe two data-rich mammalian regulatory programs, an experimental collaboration to dissect the cis-regulatory logic of a key inflammation gene using massively parallel reporter assays, and major advances in our modeling techniques; (2) new machine learning methods for reconstructing networks of TF-gene interactions that explain phenotypic differences, integration of cis- and trans-regulatory evidence from multi-omics data, and collaborations to apply these methods in cancer pharmacogenomics and behavioral neurogenomics; (3) a new probabilistic framework to combine traditional statistical scores of a non-coding variant with quantitative predictions of its regulatory impact based on the above-mentioned techniques. Explorations of new forms of synergy among these related goals of network reconstruction, cis-regulatory sequence modeling and variant interpretation will be woven throughout our research program.

Public Health Relevance

The proposed work will answer fundamental questions about molecular mechanisms underlying phenotypic differences between individuals, such as why different patients respond differently to the same drug. We will develop new computational methods for discovery of such mechanisms from a variety of genomics data sets. We will work closely with experimental biologists and use our tools to study important biological topics including cancer drug response, inflammation, and the effect of estrogen on breast cancer.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
1R35GM131819-01
Application #
9699641
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Ravichandran, Veerasamy
Project Start
2019-09-20
Project End
2024-08-31
Budget Start
2019-09-20
Budget End
2020-08-31
Support Year
1
Fiscal Year
2019
Total Cost
Indirect Cost
Name
University of Illinois Urbana-Champaign
Department
Biostatistics & Other Math Sci
Type
Biomed Engr/Col Engr/Engr Sta
DUNS #
041544081
City
Champaign
State
IL
Country
United States
Zip Code
61820