The human genome exhibits extensive copy number variation (CNV), but today we understand only its simplest forms: simple deletions and duplications. A large, functionally important, and still-uncharacterized form of genome structural variation is multi-allelic copy number variation (mCNV), involving genes and other functional elements for which three or more segregating alleles give rise to a wide range of copy numbers (such as 2 to 10) per diploid human genome. mCNVs have been refractory to widely used analysis methods and are not assessed in the genome-scale molecular or statistical approaches used to study genetically complex phenotypes in humans. In this work, we will develop approaches and supporting data sets that enable mCNVs to be routinely and rigorously analyzed for relationships to variation in human phenotypes. We will accurately analyze mCNVs in reference populations using two new approaches suited to analyzing mCNVs in cohorts, one computational (based on analysis of available whole-genome sequence data) and one molecular (based on PCR in digitally counted microdroplets) (Aim 1). By analyzing these data in a statistical framework that incorporates information about genotypes, allele frequencies, inheritance, and haplotypes, we will place mCNV alleles onto the haplotype maps created by HapMap and 1000 Genomes, and render mCNVs accessible to genotype imputation to the fullest extent possible (Aim 2). We will deeply characterize mCNVs at ten biomedically important loci to understand these polymorphisms at the levels of population genetics, mutation rates and histories, and relationships to clinical phenotypes (Aim 3). Finally, we will pilot inexpensive in silico genome-wide association studies for mCNVs based on statistical imputation into existing GWAS data sets (Aim 4).
The successful completion of this work will lead to the discovery of relationships between disease risk and gene dosage, helping to reveal the molecular etiology of human disease.
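
As an illustration of the molecular approach named in Aim 1, the sketch below shows the standard Poisson-corrected calculation commonly used to turn droplet counts from a digital PCR assay into a copy number per diploid genome. It is a minimal sketch and not the project's own pipeline: the function names, the assumption that the reference locus is present at exactly two copies, and the droplet counts are all hypothetical.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per droplet.

    A droplet is negative only if it received zero copies, so
    P(negative) = exp(-lambda), which gives lambda = -ln(1 - positive/total).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def diploid_copy_number(target_positive: int,
                        reference_positive: int,
                        total_droplets: int,
                        reference_copies: int = 2) -> float:
    """Estimate copies of the target locus per diploid genome.

    Assumes (hypothetically) that the reference locus is present at
    `reference_copies` copies per diploid genome and that both assays
    were read from the same set of droplets.
    """
    lam_target = copies_per_droplet(target_positive, total_droplets)
    lam_reference = copies_per_droplet(reference_positive, total_droplets)
    if lam_reference == 0.0:
        raise ValueError("reference assay has no positive droplets")
    return reference_copies * lam_target / lam_reference


# Made-up droplet counts, for illustration only.
estimate = diploid_copy_number(target_positive=6000,
                               reference_positive=2500,
                               total_droplets=15000)
print(f"estimated copies per diploid genome: {estimate:.2f}")
```

Rounding the estimate to the nearest integer gives a copy number call. The Poisson correction matters most at high copy numbers, where many droplets contain more than one template molecule and a raw ratio of positive droplets would underestimate the true count.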

Public Health Relevance

Variation in the human genome influences risk of disease and can be used to find the genes underlying each disease, leading to new ideas for therapies. Many genes can exist in very different numbers of copies (such as 0 to 12) in different people's genomes; this form of variation is not well understood today. Our work will help to understand this form of genome variation and enable many human geneticists to find specific genes that relate to each disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006855-02
Application #: 8532954
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
Project Start: 2012-08-18
Project End: 2016-05-31
Budget Start: 2013-06-01
Budget End: 2014-05-31
Support Year: 2
Fiscal Year: 2013
Total Cost: $477,500
Indirect Cost: $191,028

Name: Harvard University
Department: Genetics
Type: Schools of Medicine
DUNS #: 047006379
City: Boston
State: MA
Country: United States
Zip Code: 02115

Sekar, Aswin; Bialas, Allison R; de Rivera, Heather et al. (2016) Schizophrenia risk from complex variation of complement component 4. Nature 530:177-83
Genovese, Giulio; Fromer, Menachem; Stahl, Eli A et al. (2016) Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat Neurosci 19:1433-1441
Ganna, Andrea; Genovese, Giulio; Howrigan, Daniel P et al. (2016) Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci 19:1563-1565
Boettger, Linda M; Salem, Rany M; Handsaker, Robert E et al. (2016) Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 48:359-66
Handsaker, Robert E; Van Doren, Vanessa; Berman, Jennifer R et al. (2015) Large multiallelic copy number variations in humans. Nat Genet 47:296-303
1000 Genomes Project Consortium; Auton, Adam; Brooks, Lisa D et al. (2015) A global reference for human genetic variation. Nature 526:68-74
Usher, Christina L; Handsaker, Robert E; Esko, Tõnu et al. (2015) Structural forms of the human amylase locus and their relationships to SNPs, haplotypes and obesity. Nat Genet 47:921-5
Usher, Christina L; McCarroll, Steven A (2015) Complex and multi-allelic copy number variation in human disease. Brief Funct Genomics 14:329-38
Genovese, Giulio; Kähler, Anna K; Handsaker, Robert E et al. (2014) Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N Engl J Med 371:2477-87
Koren, Amnon; Handsaker, Robert E; Kamitaki, Nolan et al. (2014) Genetic variation in human DNA replication timing. Cell 159:1015-26

Showing the most recent 10 out of 13 publications