The primary objective of this grant is to develop and evaluate methods for the statistical analysis of DNA methylation data, with the ultimate goal of understanding the joint behavior of DNA methylation with genotype, copy number variation, and gene expression. A wide variety of technologies are available for studying DNA methylation (see Laird 2010 for a review). We focus on statistical method development for two different platforms provided by Illumina, Inc. All our aims are motivated by ongoing studies at the University of Southern California Epigenome Center. Specifically, we propose the following:
Specific Aim 1: To develop and evaluate preprocessing methods for Illumina's Infinium HumanMethylation BeadArrays using technical replicates and mixed samples.
a. To develop a fast Gamma-Gamma convolution model to correct for background fluorescence, and compare it with state-of-the-art methods;
b. To extend background correction methods to stratify by GC content;
c. To provide code for data preprocessing in Bioconductor.
Specific Aim 2: To develop and evaluate statistical tools for exploring condition-specific variation in DNA methylation.
a. To develop novel filters to select loci for cluster analysis that treat the outcome, the proportion of DNA methylation, as following a Beta distribution whose variance is a function of the mean;
b. To develop a method for differential methylation detection using spatial smoothing and the fused lasso.
Specific Aim 3: To develop and evaluate methods for processing whole-genome bisulfite-seq data.
a. To develop and evaluate a novel model-based SNP genotype caller for bisulfite sequence data. This tool will simultaneously extract DNA methylation content for downstream analysis;
b. To calibrate our model for known biases in bisulfite conversion and sequencing errors using control data sets of in vitro methylated and unmethylated DNA (SSS.1-treated and WGA).

We will apply the methods developed in Aims 1-3 to DNA methylation data generated at the USC Epigenome Center in studies of cancer, neurological disorder, and autoimmune disease, and make user-friendly, open-source computational tools publicly available.
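
The mean-variance relationship behind Aim 2a can be illustrated with a short sketch. For a Beta distribution with mean m and precision a + b, the variance is m(1 - m)/(a + b + 1), so loci with intermediate methylation show larger raw variance than loci near 0 or 1 even when their precision is the same. The Python code below is only an illustration under assumed values: the function names, the precision of 20, and the scaled statistic are hypothetical and are not taken from the proposal or from any published filter.

```python
import random
import statistics


def beta_variance(mean, precision):
    """Variance of a Beta(a, b) distribution with mean a/(a+b) and precision a+b."""
    return mean * (1.0 - mean) / (precision + 1.0)


def scaled_variability(values):
    """Sample variance divided by m * (1 - m), where m is the sample mean.

    Hypothetical filter statistic for illustration: it removes the part of
    the variance that a Beta model attributes to the mean alone.
    """
    m = statistics.mean(values)
    if m <= 0.0 or m >= 1.0:
        return 0.0
    return statistics.variance(values) / (m * (1.0 - m))


# Simulate two loci with the same (assumed) precision but different means.
rng = random.Random(0)
precision = 20.0
mid_locus = [rng.betavariate(0.50 * precision, 0.50 * precision) for _ in range(2000)]
low_locus = [rng.betavariate(0.05 * precision, 0.95 * precision) for _ in range(2000)]

# Raw variance favors the locus with intermediate methylation ...
print("raw variance:   ", statistics.variance(mid_locus), statistics.variance(low_locus))
# ... while the scaled statistic is similar for both, near 1 / (precision + 1).
print("scaled variance:", scaled_variability(mid_locus), scaled_variability(low_locus))
```

In this sketch, ranking loci by raw variance would preferentially select loci with mean methylation near 0.5, whereas the scaled statistic places loci with different means on a comparable footing.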

Public Health Relevance

In humans, epigenetic variation permits cells with identical genomes to specialize in function. This variation is an intermediate phenotype that is affected by exposures and predictive of disease and outcome. DNA methylation is the most commonly studied epigenetic mark and has been found to be aberrant in cancer and in autoimmune and neurological disorders. We propose to develop statistical methods for the analysis of DNA methylation measured using new high-throughput microarray and sequencing technologies, for a better understanding of its role in human disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
University of Southern California
Public Health & Prev Medicine
Schools of Medicine
Los Angeles
United States
Zhou, Wanding; Dinh, Huy Q; Ramjan, Zachary et al. (2018) DNA methylation loss in late-replicating domains is linked to mitotic cell division. Nat Genet 50:591-602
Liu, Jie; Liang, Gangning; Siegmund, Kimberly D et al. (2018) Data integration by multi-tuning parameter elastic net regression. BMC Bioinformatics 19:369
Lin, De-Chen; Dinh, Huy Q; Xie, Jian-Jun et al. (2018) Identification of distinct mutational patterns and new driver genes in oesophageal squamous cell carcinomas and adenocarcinomas. Gut 67:1769-1779
Chopra, Sameer; Liu, Jie; Alemozaffar, Mehrdad et al. (2017) Improving needle biopsy accuracy in small renal mass using tumor-specific DNA methylation markers. Oncotarget 8:5439-5448
Liu, Jie; Siegmund, Kimberly D (2016) An evaluation of processing methods for HumanMethylation450 BeadChip data. BMC Genomics 17:469
Yao, Lijing; Shen, Hui; Laird, Peter W et al. (2015) Inferring regulatory element landscapes and transcription factor networks from cancer methylomes. Genome Biol 16:105
Weisenberger, Daniel J; Levine, A Joan; Long, Tiffany I et al. (2015) Association of the colorectal CpG island methylator phenotype with molecular features, risk factors, and family history. Cancer Epidemiol Biomarkers Prev 24:512-519
Reizel, Yitzhak; Spiro, Adam; Sabag, Ofra et al. (2015) Gender-specific postnatal demethylation and establishment of epigenetic memory. Genes Dev 29:923-933
Wu, Dai-Ying; Bittencourt, Danielle; Stallcup, Michael R et al. (2015) Identifying differential transcription factor binding in ChIP-seq. Front Genet 6:169
Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori et al. (2014) Non-specific filtering of beta-distributed data. BMC Bioinformatics 15:199