This proposal aims to fine-map a gene sub-network that is robustly associated with alcohol dependence (AD). AD is extremely costly to individuals and to society in the United States and throughout the world. Family, twin, and adoption studies have established a genetic contribution to the risk for AD. Through an integrated analysis of genome-wide association studies (GWAS) and human protein-protein interaction networks, we recently identified a sub-network of 39 genes that collectively contribute to susceptibility to AD. That analysis used two GWAS datasets, from the Study of Addiction: Genetics and Environment (SAGE) and the Collaborative Study on the Genetics of Alcoholism (COGA). We replicated the association of this gene sub-network with AD in three independent samples: the European-ancestry Australian sample from the GWAS of alcohol use and alcohol use disorder in Australian Twin-Families (p = 0.006), and two samples at Yale, one of European-Americans (EA) (p = 0.0001) and one of African-Americans (AA) (p = 0.007). Functional enrichment analysis revealed that the sub-network is enriched for genes involved in cation transport, synaptic transmission, and transmission of nerve impulse. We now aim to refine the set of candidate causal genes and to determine candidate causal variants within the gene sub-network. To accomplish this, we propose to follow up the 16 most promising candidate genes in the sub-network using targeted next-generation sequencing together with advanced statistical genetics and bioinformatics approaches. The findings from this project would help improve our understanding of the biological mechanisms that underlie AD, moving us closer to designing effective prevention and treatment for the disorder.
Our specific aims are: 1) Whole gene-based targeted sequencing of candidate genes. Here we seek to identify all sequence variants, both coding and noncoding, in the most promising candidate genes selected from the sub-network. We will sequence the whole genes in 500 cases and 500 controls taken from the EA portion of COGA using the SureSelect Target Enrichment system and the Illumina HiSeq 2000. The sequence data will be analyzed and annotated using a state-of-the-art bioinformatics pipeline; 2) Identification of rare causal variants. We will use logistic regression to test the association of each low-frequency variant (0.005
Public Health Relevance