Systematic functional annotation of human cis-regulatory genetic variation

Fraser, Hunter

Abstract

Abstract: A major goal of human genetics research is to functionally characterize polymorphisms in the human genome. Two recent projects, the HapMap and the 1000 Genomes Project, have achieved tremendous success in constructing a catalog of common variants and linkage disequilibrium in diverse populations. Indeed, the HapMap data have allowed the development of genome-wide association studies (GWAS), which have already implicated over two thousand loci associated with a wide range of diseases [1]. By the beginning of this proposal's funding period, the 1000 Genomes Project will have identified essentially all ~10^7 common (>1% minor allele frequency) polymorphisms in humans [2]. With this catalog in hand, the next great challenge for the field will be to functionally characterize this vast landscape of genetic variation. It is quite possible that the vast majority of these 10^7 variants will have no detectable functional consequences, though regardless of what fraction have functional effects, identifying the specific variants that do affect phenotypes will be a significant goal for human genetics research in the coming decade. Our objective is to develop a combination of experimental and computational tools that will allow the high-throughput assignment of functional consequences to thousands of human polymorphisms, including many of those implicated by GWAS. With a simple modification to standard chromatin immunoprecipitation and sequencing ("ChIP-seq") methods [3] involving pooling of samples, we will be able to genetically map molecular traits down to the level of single nucleotides with ~100-fold gains in efficiency and cost-effectiveness compared to alternative methods. Integrating the resulting functional polymorphism maps with eSNP and GWAS results will allow many polymorphisms affecting gene expression or disease risk to be simultaneously pinpointed and functionally characterized.
Public Health Relevance: Our project will have tangible and wide-reaching effects on human health by greatly accelerating the process of pinpointing causal polymorphisms underlying diverse human diseases. Identifying causal polymorphisms is important for several reasons. For example, it allows investigators to perform targeted follow-up studies and functional assays to better understand a polymorphism's mechanism of action and its effects on disease. In addition, knowing the causal polymorphism will allow it to be genotyped in case/control cohorts to reveal the true population-attributable risk and odds ratio of the association, both of which are crucial to our understanding of each polymorphism's importance in disease; using other SNPs as proxies may substantially underestimate a polymorphism's contribution to disease.