Precision Medicine refers to the tailoring of medical treatment to the individual characteristics of each patient. The Million Veteran Program (MVP) provides a unique opportunity to perform large-scale genome-wide association studies (GWAS) and further our understanding of Precision Medicine across multiple traits and diseases. While well-powered GWAS have identified multiple risk variants, conclusive findings on the genetic factors contributing to complex traits remain limited because individual effect sizes are small. In addition, the majority of common risk variants lie within non-coding regions of the genome and, as such, the functional relevance of most discovered loci remains unclear. Our group and others have shown that a large portion of phenotypic variability in disease risk can be explained by regulatory variants, i.e., genetic variants that affect epigenetic mechanisms and the expression levels of genes. Studying gene expression and epigenome changes directly in MVP samples is not feasible because such data are not available. To overcome these limitations, we propose to apply a machine learning approach that leverages existing molecular data (unrelated to MVP) as a reference panel to directly impute multi-tissue, genome-wide gene expression and epigenome profiles in MVP samples from the existing MVP genotypes. As the reference panel, we will use large-scale datasets with genotyping and molecular profiling that our group and others have generated, including, but not limited to, the CommonMind consortium, psychENCODE, AD-AMP, STARNET and GTEx. Imputed MVP gene expression and epigenome data will provide a powerful cohort to "translate" genetic findings into dysregulation of specific molecular pathways across multiple traits, which will enhance drug discovery.
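As a minimal illustration of the imputation idea, the sketch below fits per-gene SNP weights on a reference panel and applies them to genotype-only samples. It is not the proposal's actual model: ridge regression is swapped in as a simple stand-in for the unspecified machine learning approach, the data are simulated, and all function names (`fit_expression_weights`, `impute_expression`, `simulate_demo`) are hypothetical.

```python
import numpy as np


def fit_expression_weights(ref_genotypes, ref_expression, ridge_penalty=1.0):
    """Fit SNP weights for one gene on a reference panel (ridge regression).

    ref_genotypes: (n_samples, n_snps) allele dosages in [0, 2]
    ref_expression: (n_samples,) measured expression of the gene
    """
    # Center dosages and expression so no explicit intercept is needed.
    snp_means = ref_genotypes.mean(axis=0)
    expr_mean = ref_expression.mean()
    X = ref_genotypes - snp_means
    y = ref_expression - expr_mean
    n_snps = X.shape[1]
    # Closed-form ridge solution: (X'X + penalty * I)^-1 X'y
    weights = np.linalg.solve(X.T @ X + ridge_penalty * np.eye(n_snps), X.T @ y)
    return weights, snp_means, expr_mean


def impute_expression(target_genotypes, weights, snp_means, expr_mean):
    """Predict the genetically regulated expression for genotype-only samples."""
    return (target_genotypes - snp_means) @ weights + expr_mean


def simulate_demo(seed=0):
    """Simulated reference panel and target cohort (assumed, not real data)."""
    rng = np.random.default_rng(seed)
    n_ref, n_target, n_snps = 500, 200, 20
    true_effects = np.zeros(n_snps)
    true_effects[:3] = [0.8, -0.5, 0.3]  # three hypothetical regulatory SNPs
    ref_g = rng.binomial(2, 0.3, size=(n_ref, n_snps)).astype(float)
    ref_e = ref_g @ true_effects + rng.normal(0.0, 0.5, size=n_ref)
    target_g = rng.binomial(2, 0.3, size=(n_target, n_snps)).astype(float)
    target_genetic_component = target_g @ true_effects
    weights, snp_means, expr_mean = fit_expression_weights(ref_g, ref_e)
    imputed = impute_expression(target_g, weights, snp_means, expr_mean)
    # Correlation between imputed values and the simulated genetic component.
    return float(np.corrcoef(imputed, target_genetic_component)[0, 1])


if __name__ == "__main__":
    print(f"correlation with simulated genetic component: {simulate_demo():.3f}")
```

In practice such models would be trained per gene and per tissue on the reference datasets, and only the fitted weights would be applied to the target genotypes; penalized methods such as elastic net are a common alternative to the ridge penalty assumed here.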
We propose to study gene expression and epigenome perturbations in neuropsychiatric traits (including schizophrenia, bipolar disorder, post-traumatic stress disorder, alcohol abuse, recurrent depression and suicidal ideation) and cardiometabolic traits (including type 2 diabetes, hypertension, hyperlipidemia, coronary heart disease, history of myocardial infarction, and bloodwork-quantified glucose, HbA1c and lipid profile). These disease-associated signatures can be further explored by testing their enrichment in specific molecular networks. We propose to construct tissue-specific weighted gene-gene interaction networks and causal probabilistic networks, and to assess their enrichment for disease-associated signatures in order to identify subnetworks, molecular processes and key drivers. Overall, the scale of data generation and its integration into predictive models will provide a wealth of data for diseases beyond the immediate goals of this proposal, with the potential to increase our understanding of Precision Medicine.
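To make the enrichment step concrete, the sketch below computes a one-sided hypergeometric p-value for the overlap between a disease-associated gene signature and a network module. The proposal does not state which enrichment test will be used, so the hypergeometric test is an assumption, and the function name and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_background, n_signature, n_module, n_overlap):
    """One-sided over-representation p-value, P(overlap >= n_overlap).

    n_background: number of genes tested overall
    n_signature:  number of disease-associated genes in the background
    n_module:     number of genes in the network module
    n_overlap:    number of module genes that are also signature genes
    """
    max_overlap = min(n_signature, n_module)
    total = comb(n_background, n_module)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_signature, k) * comb(n_background - n_signature, n_module - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20,000 background genes, 500 signature genes,
    # a 100-gene module, 12 of which are signature genes.
    print(hypergeom_enrichment_pvalue(20000, 500, 100, 12))
```

When many modules and traits are tested, the resulting p-values would need a multiple-testing correction before any subnetwork is called enriched.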
The Million Veteran Program (MVP), a large project initiated by the Department of Veterans Affairs, constitutes the largest genomic database in the world. MVP provides a unique opportunity to perform large-scale genetic analyses and further our understanding of Precision Medicine across multiple traits and diseases. Here we propose to leverage MVP data and perform a comprehensive analysis of the functional roles of variants associated with diseases that affect a large proportion of our Veterans, including but not limited to post-traumatic stress disorder, alcohol abuse, depression, type 2 diabetes, hypertension and hyperlipidemia. In addition to imposing a tremendous burden of suffering and economic costs, these diseases increase the mortality rate among Veterans. There is therefore an urgent need for a basic understanding of the pathophysiology of these diseases and for treatment or prevention customized to the individual characteristics of each patient.