Epigenetic regulation of gene expression through variation in DNA methylation plays a critical role in a range of biological processes including cellular differentiation, human disease and cancer. New methods for determining the fine structure of methylation patterns genome-wide have led to an explosion of research in this field. Methylation near the gene promoter is correlated with gene silencing, while unmethylated promoters are potentially active. Current computational methods classify genes as either methylated and silenced or unmethylated and potentially active based on a coarse calculation of CpG methylation state across a window of a few hundred base pairs around the transcription start site. The detailed spatial pattern of methylation across the promoter may be an important determinant of gene expression, but current computational methods ignore this valuable information. Recent work has found regions proximal to CpG island promoters, dubbed "CpG island shores", whose methylation state correlates with transcription. CpG island shores are loosely defined, however, and this concept is difficult to apply in practice. While other methylation signatures are also likely to correlate with gene expression, a general framework to identify and study them has not been established. New computational tools for identification and correlation of detailed methylation patterns with gene expression are critical for advancing our understanding of the epigenetics of gene regulation. We propose to develop software tools to detect new methylation signatures at gene promoters that correlate with expression. This will be done within a formal framework so that these signatures can be used to determine differentially methylated genes in a variety of study designs.
Taking advantage of the fact that methylation over short ranges is highly correlated, we will interpolate methylation data at individual CpG sites in a 10 kb window around the transcription start site (TSS) to yield a methylation signature at each promoter that is independent of primary sequence features. We will then use a metric from topology, the discrete Fréchet distance, to calculate the similarity between methylation signatures, and apply this metric to cluster signatures of similar type. We will then determine clusters with methylation signatures that correlate with expression. Clusters that contain both silenced and expressed genes will be examined to see if other local primary sequence features can be used to discriminate the genes based on expression. This approach will be used to detect methylation changes in case-control studies and in pairwise comparisons, such as those needed for time-course analysis or for the detection of different states of differentiation. The general framework developed here can be expanded in the future to examine methylation signatures at enhancers and gene bodies, as well as histone marks or other genomic signals that carry detailed spatial information.
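
The first two steps described above (interpolating CpG methylation onto a window around the TSS, then comparing signatures with the discrete Fréchet distance) can be illustrated with a minimal Python sketch. It is not the project's software: the function names, the 100 bp grid spacing, the flat extension beyond the outermost CpG sites, and the `x_scale` factor that puts base-pair offsets and methylation fractions on comparable scales are all assumptions made for illustration. The distance itself uses the standard dynamic-programming recurrence for the discrete Fréchet distance.

```python
from bisect import bisect_left
from math import hypot


def interpolate_signature(positions, levels, tss, half_window=5000, step=100):
    """Linearly interpolate CpG methylation onto a TSS-centred grid.

    positions: strictly increasing genomic coordinates of CpG sites
    levels:    methylation fraction (0..1) measured at each site
    Returns a list of (offset_from_tss, methylation) points.
    half_window=5000 gives the 10 kb window; step is an assumed grid spacing.
    """
    if not positions or len(positions) != len(levels):
        raise ValueError("positions and levels must be non-empty and equal length")
    signature = []
    for offset in range(-half_window, half_window + 1, step):
        x = tss + offset
        i = bisect_left(positions, x)
        if i == 0:
            y = levels[0]            # left of first CpG: extend flat (assumption)
        elif i == len(positions):
            y = levels[-1]           # right of last CpG: extend flat (assumption)
        else:
            x0, x1 = positions[i - 1], positions[i]
            y0, y1 = levels[i - 1], levels[i]
            y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        signature.append((offset, y))
    return signature


def discrete_frechet(p, q, x_scale=1e-4):
    """Discrete Frechet distance between two signatures.

    p, q:    lists of (offset, methylation) points
    x_scale: assumed weight converting bp offsets to methylation-like units
    """
    if not p or not q:
        raise ValueError("signatures must be non-empty")

    def dist(a, b):
        return hypot((a[0] - b[0]) * x_scale, a[1] - b[1])

    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]   # ca[i][j]: best coupling of p[:i+1], q[:j+1]
    for i in range(n):
        for j in range(m):
            cost = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = cost
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], cost)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], cost)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, cost)
    return ca[n - 1][m - 1]


# Hypothetical example: two promoters with made-up CpG measurements.
sig_a = interpolate_signature([900, 1500, 2100], [0.1, 0.2, 0.9], tss=1500,
                              half_window=500, step=250)
sig_b = interpolate_signature([950, 1400, 2000], [0.8, 0.7, 0.1], tss=1500,
                              half_window=500, step=250)
pairwise_distance = discrete_frechet(sig_a, sig_b)
```

A matrix of such pairwise distances could then be passed to any clustering method that accepts precomputed distances; the choice of clustering method is left open here, as the summary does not specify one.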

Public Health Relevance

DNA methylation, the addition of a methyl group to the cytosine in CpG dinucleotides, is an important epigenetic regulator of gene expression in cells. DNA methylation has been implicated in a large number of biological processes including cellular differentiation and cancer. We will develop software tools to analyze data from new techniques for mapping DNA methylation genome-wide to discover new methylation signatures that are associated with silenced and active genes. These will include tools that use these signatures to determine which genes are differentially regulated in different cell types and in human disease.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Exploratory/Developmental Grants (R21)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Washington University
Internal Medicine/Medicine
Schools of Medicine
Saint Louis
United States
Hiken, J F; McDonald, J I; Decker, K F et al. (2017) Epigenetic activation of the prostaglandin receptor EP4 promotes resistance to endocrine therapy for breast cancer. Oncogene 36:2319-2327
Schlosberg, Christopher E; VanderKraats, Nathan D; Edwards, John R (2017) Modeling complex patterns of differential DNA methylation that associate with gene expression changes. Nucleic Acids Res 45:5100-5111
McDonald, James I; Celik, Hamza; Rois, Lisa E et al. (2016) Reprogrammable CRISPR/Cas9-based system for inducing site-specific DNA methylation. Biol Open 5:866-74
Lund, Kirstin; Cole, John J; VanderKraats, Nathan D et al. (2014) DNMT inhibitors reverse a specific signature of aberrant promoter DNA methylation and associated gene silencing in AML. Genome Biol 15:406
Cruickshanks, Hazel A; McBryan, Tony; Nelson, David M et al. (2013) Senescent cells harbour features of the cancer epigenome. Nat Cell Biol 15:1495-506
VanderKraats, Nathan D; Hiken, Jeffrey F; Decker, Keith F et al. (2013) Discovering high-resolution patterns of differential DNA methylation that correlate with gene expression changes. Nucleic Acids Res 41:6816-27
Tian, Fei; Zhan, Fei; VanderKraats, Nathan D et al. (2013) DNMT gene expression and methylome in Marek's disease resistant and susceptible chickens prior to and following infection by MDV. Epigenetics 8:431-44