Epigenetic patterns provide an extra layer of gene regulation beyond the genomic sequence and play a critical role in maintaining cell-type-specific gene expression programs. Disruption of epigenetic regulation can cause severe diseases, including cancer. During the past few years, a large amount of epigenomic data has been generated, revealing significant changes in epigenetic patterns across cell types and in diseased tissues. However, the targeting mechanisms of epigenetic factors remain poorly understood. The investigators propose integrated computational and experimental approaches to tackle this challenge, summarized by the following three specific aims.
In Aim 1, a wavelet analysis-based computational approach will be developed to predict genome-wide epigenetic patterns using DNA sequence information.
In Aim 2, gene expression data will be further integrated to predict tissue-specific epigenetic changes.
In Aim 3, controlled experiments will be carried out in mouse embryonic stem cells to validate computational predictions.
If successful, our proposed research will provide mechanistic insights into epigenetic targeting and can be used to develop tools to reverse specific disease-causing, aberrant epigenetic changes as a potential novel therapeutic approach for cancer and other diseases.
Our preliminary studies suggest that a wavelet-based computational approach, previously developed by the principal investigator and colleagues, is effective for predicting a number of epigenetic modifications. This approach will be optimized in Aim 1 for de novo detection of sequence features associated with various epigenetic modifications by incorporating more effective wavelet tools such as thresholding and wavelet packet analysis. Its predictive power will be further enhanced by incorporating previously annotated sequence features and by using a more effective classification method.
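As a rough illustration of the kind of wavelet feature extraction described for Aim 1, the sketch below decomposes a numerically encoded DNA sequence with a Haar wavelet, soft-thresholds the detail coefficients, and summarizes each scale by its remaining energy. It is not the investigators' method: the GC indicator encoding, the choice of the Haar wavelet, the threshold value, and all function names are assumptions made for the example.

```python
import math

def gc_signal(seq):
    """Encode DNA as 1.0 for G/C and 0.0 otherwise (an assumed, very simple encoding)."""
    return [1.0 if base in "GC" else 0.0 for base in seq.upper()]

def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Multi-level decomposition; details are returned finest scale first."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def soft_threshold(c, t):
    """Shrink a coefficient toward zero by t; small coefficients become exactly 0."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0

def wavelet_features(seq, levels=3, threshold=0.5):
    """Per-scale energy of the thresholded detail coefficients (finest scale first)."""
    _, details = haar_decompose(gc_signal(seq), levels)
    return [sum(soft_threshold(c, threshold) ** 2 for c in d) for d in details]

if __name__ == "__main__":
    # A GC-rich half followed by an AT-rich half: only the coarsest scale responds.
    print(wavelet_features("GCGCATAT", levels=3, threshold=0.5))
```

In a pipeline of this kind, the per-scale features would be computed for each genomic region and passed to a classifier; thresholding discards small coefficients that are more likely to reflect noise than a sequence feature.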
For Aim 2, gene expression data will be combined with sequence analysis to predict tissue-specific epigenetic changes. A sparse principal component regression method will be used to identify the tissue-specific, context-dependent regulatory effects of modules, each characterized by a combinatorial pattern of multiple regulators.
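To make the idea of sparse principal component regression concrete, here is a minimal sketch under stated assumptions. It extracts a single sparse loading vector (a "module" of a few regulators) with a soft-thresholded alternating power iteration, a common simple heuristic, and then regresses a response on the resulting component score. The data layout (rows as samples, columns as regulator expression levels), the penalty value, and all names are hypothetical; the method actually proposed for Aim 2 may differ.

```python
import math

def center_columns(X):
    """Subtract each column's mean so components describe variation around the mean."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def soft_threshold(c, t):
    """Shrink a value toward zero by t; small values become exactly 0."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0

def unit(vec):
    """Scale a vector to unit length, failing loudly if it is all zeros."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm == 0.0:
        raise ValueError("zero vector: penalty too large or data degenerate")
    return [c / norm for c in vec]

def sparse_component(Xc, penalty, iters=100):
    """Leading sparse loading vector of centered data Xc via alternating updates."""
    n, p = len(Xc), len(Xc[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        u = unit([sum(row[j] * v[j] for j in range(p)) for row in Xc])   # u = Xv / |Xv|
        w = [sum(Xc[i][j] * u[i] for i in range(n)) for j in range(p)]   # w = X^T u
        v = unit([soft_threshold(c, penalty) for c in w])                # sparsify, rescale
    return v

def sparse_pc_regression(X, y, penalty):
    """Regress y on the score of one sparse component; returns (loadings, slope, intercept)."""
    Xc = center_columns(X)
    y_mean = sum(y) / len(y)
    v = sparse_component(Xc, penalty)
    scores = [sum(row[j] * v[j] for j in range(len(v))) for row in Xc]
    slope = sum(s * (yi - y_mean) for s, yi in zip(scores, y)) / sum(s * s for s in scores)
    return v, slope, y_mean

if __name__ == "__main__":
    t = [-2.0, -1.0, 0.0, 1.0, 2.0]
    noise = [0.1, -0.1, 0.0, 0.1, -0.1]
    X = [[ti, ti, ni] for ti, ni in zip(t, noise)]   # two co-varying regulators plus noise
    y = [3.0 * ti + 1.0 for ti in t]                 # toy response
    print(sparse_pc_regression(X, y, penalty=0.5))
```

The nonzero entries of the loading vector indicate which regulators form the module, and the slope estimates that module's effect on the response. Fitting the model separately per tissue would be one way to expose tissue-specific effects.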
For Aim 3, we will select DNA sequences and regulatory modules based on computational predictions and experimentally test their roles in epigenetic targeting using several assays, including the introduction of genetic mutations, forced or inhibited gene expression, and chromatin immunoprecipitation.

Public Health Relevance

The proposed research will identify key regulatory factors responsible for aberrant epigenetic profiles in cancer and other diseases. It will also help identify novel epigenetic therapeutic targets.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Pazin, Michael J
Dana-Farber Cancer Institute
United States