Mental illnesses are among the most devastating diseases affecting human populations, placing a huge burden on individuals, families and society. Genome-wide association studies (GWAS) have identified dozens of common single nucleotide polymorphisms (SNPs) that are associated with psychiatric diseases, but the majority of those SNPs map to intergenic or intronic regions and are functionally unclassified. Existing software and algorithms only query multiple databases and return lists of hits; they do not integrate the results and they ignore much of the valuable regulatory information. The overall goal of this proposal is to integrate all available genetic, genomic and epigenomic data to generate a probability-based prediction of a SNP's influence on gene expression levels in brain. Our previous studies have shown that psychiatric GWAS signals are enriched for brain eQTL SNPs (eSNPs), and that these brain eSNPs are likely to be functional and to contribute to disease susceptibility. We will use SNPs in eQTLs to anchor a chain of evidence incorporating histone marks, conserved sequences, transcription factor binding sites, DNA methylation, accessible chromatin, non-coding RNA, and other data. We will use a machine learning method to predict regulatory SNPs based on known relationships between these epigenetic marks and their target genes, as well as their distinct patterns in the genome. We will also use our novel unsupervised deconvolution algorithm to extract cell-type-specific (i.e., neuron vs. non-neuron) measures from heterogeneous brain tissue data to improve our predictions. We will use both statistical and experimental methods to validate the predictions. Quantitative PCR and CRISPR-Cas9 will be used on induced pluripotent stem cell lines to compare gene expression levels between alleles of predicted functional SNPs. Both the algorithm and the predicted functional variants will be made public via a website and a standalone application. The novel algorithm will significantly improve our understanding of psychiatric disease genetics by uncovering the gene-regulatory functions of disease-associated, non-coding SNPs.
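
As an illustration only, the idea of a probability-based prediction from integrated annotations can be sketched with a small logistic regression. This is not the proposal's algorithm, which the abstract does not specify. The feature names, the toy training data, and the choice of logistic regression are all assumptions made for this sketch: each SNP is described by binary flags for overlapping annotations, and known brain eSNPs serve as positive training labels.

```python
import math

# Hypothetical binary annotation flags per SNP (names are illustrative only).
FEATURES = ["histone_mark", "conserved", "tf_binding_site", "open_chromatin"]

# Made-up toy training set: label 1 = known brain eSNP, 0 = not an eSNP.
TRAIN = [
    ([1, 1, 1, 1], 1),
    ([1, 0, 1, 1], 1),
    ([0, 1, 1, 0], 1),
    ([1, 1, 0, 1], 1),
    ([0, 0, 0, 0], 0),
    ([0, 1, 0, 0], 0),
    ([0, 0, 0, 1], 0),
    ([1, 0, 0, 0], 0),
]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that a SNP with annotation vector x is regulatory."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit(data, n_features, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * n_features
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = fit(TRAIN, len(FEATURES))

# Score two unseen, hypothetical SNPs.
p_high = predict_proba(weights, bias, [1, 1, 1, 1])  # overlaps every annotation
p_low = predict_proba(weights, bias, [0, 0, 0, 0])   # overlaps none
print(f"all annotations: {p_high:.3f}, no annotations: {p_low:.3f}")
```

In a real setting the flags would be replaced by quantitative, cell-type-specific measures, and the model would be evaluated on held-out eSNPs rather than on its own training data.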

Public Health Relevance

Genome-wide association studies (GWAS) have identified thousands of common SNPs associated with major complex diseases, but the majority of those SNPs are located in non-coding regions, leaving those genetic associations functionally unexplained. Existing functional prediction software and algorithms only query a few databases, provide no statistical or biological integration, and ignore much valuable regulatory information. We propose to integrate all available genetic, genomic and epigenomic data and use machine learning to produce a probability-based prediction of a SNP's influence on gene expression levels in brain. We will also use a novel unsupervised deconvolution algorithm to extract cell-type-specific measures from heterogeneous brain tissue data to improve our predictions. We will use both statistical and experimental methods to validate the predictions. Both the algorithm and the predicted functional variants will be made public via a website and a standalone application. The novel algorithm will significantly improve our understanding of psychiatric disease genetics by assigning meaningful biological functions to non-coding, disease-associated SNPs.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Research Project (R01)
Project #: 1R01ES024988-01
Application #: 8815564
Study Section: Special Emphasis Panel (ZRG1-IMST-R (51))
Program Officer: Chadwick, Lisa
Project Start: 2014-09-10
Project End: 2016-08-31
Budget Start: 2014-09-10
Budget End: 2015-08-31
Support Year: 1
Fiscal Year: 2014
Total Cost: $325,984
Indirect Cost: $86,710

Name: University of Illinois at Chicago
Department: Psychiatry
Type: Schools of Medicine
DUNS #: 098987217
City: Chicago
State: IL
Country: United States
Zip Code: 60612

Grennan, Kay S; Chen, Chao; Gershon, Elliot S et al. (2014) Molecular network analysis enhances understanding of the biology of mental disorders. Bioessays 36:606-16