The incidence of diagnosed psychiatric disorders has been increasing for decades, affecting millions of individuals. Despite their high heritability, the underlying molecular mechanisms of these disorders remain elusive. Most risk loci are located in noncoding genomic elements without direct effects on protein products. Comprehensive functional annotation and variant impact quantification are essential to provide new molecular insights and discover therapeutic targets. Recent advances in sequencing technologies and community efforts to share genomic data provide unprecedented opportunities to understand how genetic variants contribute to psychiatric diseases. This application describes the development of integrative strategies and machine learning methods that combine novel assays (such as STARR-seq) with population-scale genomic profiles to elucidate the genetic regulatory grammar in the human prefrontal cortex (PFC) and to prioritize genetic variants in psychiatric disorders. Specifically, we will (1) dissect the cis-regulatory landscape of the PFC using population-scale epigenetics data, (2) construct multi-model gene regulatory networks by linking distal cis-regulatory elements to genes using chromatin co-variability analyses, and (3) integrate genetic, epigenetic, and transcriptional data to identify key transcription factors and variants that contribute to psychiatric disorders. Distinct from existing efforts focusing on one genome, the proposed work presents a novel big-data approach for both modeling gene regulation and investigating disease-risk factors by incorporating heterogeneous multi-omics profiles from hundreds of individuals. The resultant comprehensive list of cis-regulatory elements will expand the number of known functional regions in the human brain by at least an order of magnitude.
We will release our methods and resources in the form of web services, distributed open-source software, and annotation databases, which will also benefit other investigators exploring the genetic underpinnings of neuropsychiatric disorders. In addition to its scientific content, this application proposes a comprehensive training program for preparing an independent investigator in computational genomics and neurogenetics. This training will take place at Yale University (in the Dept. of Molecular Biophysics and Biochemistry) under the mentorship of Prof. Mark Gerstein (functional genomics), Prof. Nenad Sestan (neurogenetics), and Prof. Hongyu Zhao (statistical genetics and machine learning). A committee of experienced psychiatric disease experts and data scientists will also provide advice on both scientific research and career development.

Public Health Relevance

The proposed study will leverage advanced machine learning methods to discover cis-regulatory elements in the noncoding genome by integrating novel functional characterization technologies (such as STARR-seq) with genetic, epigenetic, and transcriptomic data from psychiatric patients. In contrast to existing methods that rely on a single genome, this work assumes a dynamic regulatory program in each individual and explores the regulatory heterogeneity across hundreds of epigenomes, thereby significantly expanding our current knowledge of the noncoding genome and facilitating functional interpretation of genetic variants. It will also generate publicly available annotation resources and software packages for the scientific community, thereby accelerating the discovery of novel therapeutic targets for treating psychiatric disorders.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Mental Health (NIMH)
Type: Research Scientist Development Award - Research & Training (K01)
Project #: 1K01MH123896-01
Application #: 10039384
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Van'T Veer, Ashlee V
Project Start: 2020-07-17
Project End: 2024-06-30
Budget Start: 2020-07-17
Budget End: 2021-06-30
Support Year: 1
Fiscal Year: 2020
Total Cost:
Indirect Cost:

Name: University of California Irvine
Department: Biostatistics & Other Math Sci
Type: Computer Center
DUNS #: 046705849
City: Irvine
State: CA
Country: United States
Zip Code: 92617