SVM-based Analysis of the Fine Scale Structure of Regulatory Elements

Beer, Michael

Abstract

The ENCODE projects have generated large, high-quality functional genomic datasets that have the potential to dramatically impact our understanding of the specific mechanisms and general principles by which cell-specific regulatory elements function. We propose to develop an SVM-based computational model to predict enhancers from these datasets and resolve their fine-scale structure. To investigate these fine-scale features, we will use an integrative approach that combines novel computational development, statistical analysis of ENCODE datasets, systematic scoring of human sequence variation, and high-throughput validation to improve our understanding of how DNA sequence features and variation contribute to regulatory function. Building on our previous work using k-mer features to predict mammalian enhancers from genomic DNA sequence, we propose improvements in the treatment of sequence features that facilitate statistically robust estimation of long k-mer features and improve spatial resolution. This approach does not rely on prior biological knowledge, and it uncovers the sets of novel TFs and cofactors that specify the cell-specific activity of these enhancers. We will train this model on ENCODE DNase-seq and ChIP-seq data and catalogue the regulatory elements in the available human cell-line and mouse datasets. In addition, because this model makes specific predictions of the contributions of individual features to enhancer activity, we propose to test these predictions experimentally by directly quantifying the impact of mutating these elements in a luciferase reporter system. Finally, we will evaluate and experimentally assess the predicted impact of specific human SNPs in a set of targeted cell lines. This project should contribute significantly toward a predictive model of regulatory element function and an understanding of how sequence variation impacts disease.
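
As a rough illustration of the k-mer idea described in the abstract, the sketch below counts strand-collapsed k-mers in a DNA sequence and scores a single-nucleotide variant as the change in a linear model's output. It is not the project's implementation: the function names, the choice of k = 3, and the toy weights are all hypothetical. In practice the weights would come from an SVM trained on positive (for example, DNase-seq peak) and negative sequences.

```python
from collections import Counter

# Translation table for complementing A/C/G/T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Collapse a k-mer and its reverse complement to one representative."""
    return min(kmer, reverse_complement(kmer))


def kmer_counts(seq, k):
    """Count canonical k-mers in seq, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


def score(seq, weights, k):
    """Linear score: sum of (k-mer weight * k-mer count) over the sequence."""
    return sum(weights.get(kmer, 0.0) * n
               for kmer, n in kmer_counts(seq, k).items())


def variant_delta(ref_seq, pos, alt_base, weights, k):
    """Score change caused by substituting alt_base at 0-based position pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return score(alt_seq, weights, k) - score(ref_seq, weights, k)


# Hypothetical weights standing in for a trained model (assumption).
toy_weights = {"ACG": 1.0, "AAA": -0.5}

ref = "TTACGTT"
print(kmer_counts(ref, 3))
print(variant_delta(ref, 3, "A", toy_weights, 3))
```

A negative delta here would mean the substitution removes sequence features that the (toy) model associates with activity, which is the kind of per-variant prediction the abstract proposes to test experimentally.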

Public Health Relevance

We propose to develop SVM-based models to identify and predict the fine-scale structure of cell-specific regulatory elements from ENCODE datasets, assess the functional impact of human variation on the activity of these elements, and directly evaluate the predictions experimentally in the relevant human cell lines. This project should significantly contribute to our understanding of regulatory element function and how sequence variation impacts disease.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG007348-01
Application #: 8556758
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Feingold, Elise A

Project Start: 2013-09-13
Project End: 2018-06-30
Budget Start: 2013-09-13
Budget End: 2014-06-30
Support Year: 1
Fiscal Year: 2013
Total Cost: $460,570
Indirect Cost: $155,620

Institution

Name: Johns Hopkins University
Department: Biomedical Engineering
Type: Schools of Medicine
DUNS #: 001910777

City: Baltimore
State: MD
Country: United States
Zip Code: 21218

Related projects


NIH 2017 R01 HG	SVM-based Analysis of the Fine Scale Structure of Regulatory Elements Beer, Michael A. / Johns Hopkins University
NIH 2016 R01 HG	SVM-based Analysis of the Fine Scale Structure of Regulatory Elements Beer, Michael A. / Johns Hopkins University
NIH 2015 R01 HG	SVM-based Analysis of the Fine Scale Structure of Regulatory Elements Beer, Michael A. / Johns Hopkins University
NIH 2014 R01 HG	SVM-based Analysis of the Fine Scale Structure of Regulatory Elements Beer, Michael A. / Johns Hopkins University	$461,332
NIH 2013 R01 HG	SVM-based Analysis of the Fine Scale Structure of Regulatory Elements Beer, Michael A. / Johns Hopkins University	$460,570

Publications

Gate, Rachel E; Cheng, Christine S; Aiden, Aviva P et al. (2018) Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50:1140-1150

Migeon, Barbara R; Beer, Michael A; Bjornsson, Hans T (2017) Embryonic loss of human females with partial trisomy 19 identifies region critical for the single active X. PLoS One 12:e0170403

Beer, Michael A (2017) Predicting enhancer activity and variant impact using gkm-SVM. Hum Mutat 38:1251-1258

Kreimer, Anat; Zeng, Haoyang; Edwards, Matthew D et al. (2017) Predicting gene expression in massively parallel reporter assays: A comparative study. Hum Mutat 38:1240-1250

Mo, Alisa; Luo, Chongyuan; Davis, Fred P et al. (2016) Epigenomic landscapes of retinal rods and cones. Elife 5:e11613

Ghandi, Mahmoud; Mohammad-Noori, Morteza; Ghareghani, Narges et al. (2016) gkmSVM: an R package for gapped-kmer SVM. Bioinformatics 32:2205-7

Karnik, Rahul; Beer, Michael A (2015) Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space. PLoS One 10:e0140557

Pervouchine, Dmitri D; Djebali, Sarah; Breschi, Alessandra et al. (2015) Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression. Nat Commun 6:5903

Lee, Dongwon; Gorkin, David U; Baker, Maggie et al. (2015) A method to predict the impact of regulatory variants from DNA sequence. Nat Genet 47:955-61

Lin, Shin; Lin, Yiing; Nery, Joseph R et al. (2014) Comparison of the transcriptional landscapes between human and mouse tissues. Proc Natl Acad Sci U S A 111:17224-9

Showing the most recent 10 out of 15 publications
