My long-term research goal is to understand the organization and function of cis-regulatory modules (CRMs) in the human genome, with a focus on their impact on development and disease. CRMs, such as promoters, enhancers, and insulators, are DNA elements that regulate gene expression. Genome-wide association studies (GWAS) have shown that most variants associated with a phenotype or disease are located outside of protein-coding regions and are postulated to affect gene expression levels through CRMs. Therefore, understanding the organization and function of CRMs is key to identifying the causes of genetic diseases and provides an essential backbone for precision medicine. Even though millions of putative CRMs have recently been identified with the help of high-throughput assays, it remains challenging to pinpoint functional CRMs that regulate tissue- and developmental stage-specific transcription. In fact, a large proportion of the CRM variants identified so far have no or only mild effects on the phenotype. As a result, those insights have very limited clinical application. Over the next five years, the goal of my research is to accurately identify causal CRM variants that affect normal blood cell development and impact childhood blood disorders. Several major hurdles must be overcome to achieve this goal. First, mounting evidence indicates that expression fluctuation is an important trait of genes. Importantly, the tolerance of expression fluctuation varies among genes. We reason that CRMs modulating transcription of highly expression-sensitive genes tend to be essential to cell function and to harbor pathological non-coding variants. However, our understanding of expression-sensitive genes and their underlying biology is still rudimentary. Second, different epigenetic modification markers are routinely used to map potential CRMs. However, at many loci, those epigenetic markers are not required for CRM function. Overreliance on associative, rather than causative, markers can confound the accurate identification of biologically important CRMs. Third, while the genetic code of protein-coding sequences has been known for decades, a comparable "grammar" for non-coding sequences, and for CRMs in particular, is still lacking. As a result, we are not able to predict how CRM variants affect regulatory function. Based on these challenges, we ask three fundamental questions: 1) How can expression-sensitive genes be systematically identified? 2) How can the causative mechanisms of CRMs be deciphered? 3) How do single-nucleotide variants (SNVs) affect CRM function? If successful, the proposed studies will identify functionally important CRMs controlling health-related traits and pinpoint pathological non-coding variants within those CRMs. A better understanding of the anatomy and function of CRMs will facilitate precision medicine by allowing us to treat genetic diseases through manipulation of CRM function via gene editing or pharmacological approaches.

Public Health Relevance

Susceptibility to childhood genetic diseases and individual responses to therapies are influenced by genetic variation in DNA sequences that regulate the timing and amount of gene expression. Our research aims to accurately identify these regulatory DNA sequences and deepen our understanding of their roles in normal gene expression and human health. Our findings should lead to a better understanding of complex health-related traits and open novel avenues to prevent or treat genetic diseases.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
5R35GM133614-02
Application #
9997981
Study Section
Special Emphasis Panel (ZGM1)
Program Officer
Krasnewich, Donna M
Project Start
2019-09-01
Project End
2024-08-31
Budget Start
2020-09-01
Budget End
2021-08-31
Support Year
2
Fiscal Year
2020
Total Cost
Indirect Cost
Name
St. Jude Children's Research Hospital
Department
Type
DUNS #
067717892
City
Memphis
State
TN
Country
United States
Zip Code
38105