Software Tools For Regulatory Analysis of Large Cancer Methylome Datasets

Berman, Benjamin

Abstract

Massively high-throughput DNA sequencing is quickly changing the study of gene regulation in cancer. Large-scale efforts such as the NIH-funded "Encyclopedia of DNA Elements" (ENCODE) have exploited sequencing to map genome-wide chromatin features in human cancer cell lines using transformative technologies such as Chromatin Immunoprecipitation sequencing (ChIP-seq) and DNase I hypersensitivity sequencing (DHS-seq), and have made great strides toward a comprehensive database of gene regulatory elements in the human genome. The majority of cancer genomics projects focusing on patient samples use DNA methylation profiling, and we and others have shown that integration of these methylation profiles with ENCODE data can enable the identification of biologically relevant epigenomic changes. However, the software tools required are not readily available to most cancer biologists. The reference maps themselves require domain knowledge of gene regulatory features that is beyond the scope of many clinical research groups, and the publicly available datasets are too often the result of heterogeneous and frequently shifting analysis pipelines. We will develop automated tools for unifying the various gene regulatory databases, and develop powerful yet user-friendly methylation workflows using the open-source R/BioConductor framework and our open-source, web-based Galaxy system. Standard workflows will use the methods we have developed for the TCGA project to import and analyze large numbers of raw methylation data files from either the Illumina Infinium or Bisulfite-seq platforms. We will also allow import of arbitrary sample metadata so users can perform two-way or multi-way comparisons between cancer subtypes or clinical covariates.
Our workflows will be driven by the most current understanding of the chromatin landscape, which includes using histone modifications and DNase hypersensitivity data to define focal chromatin state, and Hi-C (nuclear conformation) and replication timing to define nuclear topological domains. Recent work by our lab and others suggests that methylation changes at cis-regulatory elements such as enhancers and insulators are driven primarily by binding of individual transcription factors, and thus reflect direct targeting of genes by specific transcriptional networks. We will use combined ChIP-seq and DNA binding motif analyses available from ENCODE to analyze user methylation data at the level of the individual protein/DNA interaction site. Finally, because the success of this effort will be measured by the degree of adoption within the cancer genomics community, we will engage several large-scale cancer genomics groups to act as beta testers and help us improve our workflows.

Public Health Relevance

Accumulating evidence suggests that cancer is often a disease driven by epigenetic defects. DNA methylation profiling is the most powerful technology for identifying epigenetic defects in patient populations, and the most exciting new discoveries have been made by incorporating data from massive public databases of gene regulation. Many innovative software tools have been developed for this purpose by our laboratories and others, but they are difficult or impossible to use for non-programmers. We will use this grant to extend and develop these tools into simple, web-based workflows aimed at clinical cancer researchers.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA184826-01
Application #: 8685796
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Li, Jerry

Project Start: 2014-05-13
Project End: 2017-04-30
Budget Start: 2014-05-13
Budget End: 2015-04-30
Support Year: 1
Fiscal Year: 2014

Institution

Name: University of Southern California
Department: Public Health & Prev Medicine
Type: Schools of Medicine

City: Los Angeles
State: CA
Country: United States
Zip Code: 90089

Related projects


NIH 2016 U01 CA	Software Tools For Regulatory Analysis of Large Cancer Methylome Datasets Berman, Benjamin P. / Cedars-Sinai Medical Center
NIH 2015 U01 CA	Software Tools For Regulatory Analysis of Large Cancer Methylome Datasets Berman, Benjamin P. / Cedars-Sinai Medical Center	$344,390
NIH 2014 U01 CA	Software Tools For Regulatory Analysis of Large Cancer Methylome Datasets Berman, Benjamin P. / University of Southern California
NIH 2014 U01 CA	Software Tools for Regulatory Analysis of Large Cancer Methylome Datasets Berman, Benjamin P. / Cedars-Sinai Medical Center	$293,717

Publications

Zhou, Wanding; Dinh, Huy Q; Ramjan, Zachary et al. (2018) DNA methylation loss in late-replicating domains is linked to mitotic cell division. Nat Genet 50:591-602

Hao, Jia-Jie; Lin, De-Chen; Dinh, Huy Q et al. (2016) Spatial intratumoral heterogeneity and temporal clonal evolution in esophageal squamous cell carcinoma. Nat Genet 48:1500-1507

Yao, Lijing; Shen, Hui; Laird, Peter W et al. (2015) Inferring regulatory element landscapes and transcription factor networks from cancer methylomes. Genome Biol 16:105

Yao, Lijing; Berman, Benjamin P; Farnham, Peggy J (2015) Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes. Crit Rev Biochem Mol Biol 50:550-73

Lay, Fides D; Liu, Yaping; Kelly, Theresa K et al. (2015) The role of DNA methylation in directing the functional organization of the cancer epigenome. Genome Res 25:467-77

Coetzee, Simon G; Coetzee, Gerhard A; Hazelett, Dennis J (2015) motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 31:3847-9
