Chromatin plays an essential role in transcriptional regulation, and chromatin-related genes are frequently mutated in cancers. Dissecting the functions of chromatin in gene regulation is important for understanding the molecular mechanisms of oncogenesis and tumor progression. As an experienced computational biologist with expertise in ChIP-seq bioinformatics and epigenetics, I have focused my research on developing computational methodologies for high-throughput genomic data analysis and on computational modeling of chromatin regulation of gene expression. With additional research training in cancer biology, I will build a research program in computational cancer epigenetics and establish an independent academic career.
Recent studies have demonstrated the feasibility of targeting chromatin regulators of active open regions in the genome as novel therapeutics for cancer treatment. However, the context-specific substrates of chromatin regulators and the mechanisms by which chromatin regulates gene expression remain largely unclear. With the advent of high-throughput genomic techniques based on next-generation sequencing, including ChIP-seq, DNase-seq, and ATAC-seq, a large amount of genomic profiling data has become available, making it possible to systematically decipher gene regulatory mechanisms with an integrative computational approach. The objective of this project is to develop novel quantitative and computational methodologies for studying epigenetic gene regulation and the functions of chromatin regulators in cancer. Specifically, we propose to develop integrative computational methods that leverage the abundant public ChIP-seq, DNase-seq, and ATAC-seq data to predict functional regulatory elements and transcription factors (TFs).
First (Aim 1), we will develop a method that predicts the functional enhancer elements and associated TFs for any given gene set, using public histone mark ChIP-seq data across multiple cell types.
Second (Aim 2), we will develop a quantitative model that identifies nucleotide-resolution chromatin accessibility dynamics from paired-end DNase-seq or ATAC-seq data while correcting for intrinsic biases in the data.
Finally (Aim 3), we will integrate publicly available DNase-seq, ATAC-seq, and ChIP-seq data in a comprehensive database and systematically characterize the functions of chromatin regulators, with a focus on EZH2, in a few cancer systems, including castration-resistant prostate cancer (CRPC) cells and malignant peripheral nerve sheath tumors (MPNSTs).
These computational methods complement existing bioinformatics methodologies and will have broad applications in the study of cancer epigenetics and gene regulation. The proposed research will fill the knowledge gap between oncogenic drivers and downstream gene expression programs, and could provide mechanistic support for the development of novel targeted therapeutics for cancer precision medicine.

Public Health Relevance

Cancer is a disease of the genome. Chromatin regulators are a group of proteins that play key roles in gene regulation and are frequently mutated in many cancers. The proposed research will develop comprehensive quantitative models and computational methods for studying chromatin regulation of gene expression. It will provide useful bioinformatics algorithms and data resources to the research community. The anticipated results from applying these computational methods can provide mechanistic insights for developing novel therapeutics for cancer precision medicine.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Career Transition Award (K22)
Study Section
Subcommittee I - Transition to Independence (NCI)
Program Officer
Korczak, Jeannette F
University of Virginia
Public Health & Prev Medicine
Schools of Medicine
United States
Jose, Cynthia C; Jagannathan, Lakshmanan; Tanwar, Vinay S et al. (2018) Nickel exposure induces persistent mesenchymal phenotype in human lung epithelial cells through epigenetic activation of ZEB1. Mol Carcinog 57:794-806
Martins, André L; Walavalkar, Ninad M; Anderson, Warren D et al. (2018) Universal correction of enzymatic sequence bias reveals molecular signatures of protein/DNA interactions. Nucleic Acids Res 46:e9
Wang, Zhenjia; Civelek, Mete; Miller, Clint L et al. (2018) BART: a transcription factor prediction tool with query gene sets or epigenomic profiles. Bioinformatics 34:2867-2869
Mei, Shenglin; Meyer, Clifford A; Zheng, Rongbin et al. (2017) Cistrome Cancer: A Web Resource for Integrative Gene Regulation Modeling in Cancer. Cancer Res 77:e19-e22