The human genome exhibits extensive copy number variation (CNV). Today we understand only the simplest forms of CNV: simple deletions and duplications. A large, functionally important, and still-uncharacterized form of genome structural variation is multi-allelic copy number variation (mCNV), involving genes and other functional elements for which three or more segregating alleles give rise to a wide range of copy numbers (such as 2 to 10) per diploid human genome. mCNVs have been refractory to widely used analysis methods and are not assessed by the genome-scale molecular or statistical approaches used to study genetically complex phenotypes in humans. In this work, we will develop approaches and supporting data sets that enable mCNVs to be routinely and rigorously analyzed for their relationship to variation in human phenotypes. We will accurately analyze mCNVs in reference populations and cohorts using two new approaches, one computational (based on analysis of available whole-genome sequence data) and one molecular (based on PCR in digitally counted microdroplets) (Aim 1). By analyzing these data in a statistical framework that incorporates information about genotypes, allele frequencies, inheritance, and haplotypes, we will place mCNV alleles onto the haplotype maps created by HapMap and 1000 Genomes, and render mCNVs accessible to genotype imputation to the fullest extent possible (Aim 2). We will deeply characterize mCNVs at ten biomedically important loci to understand these polymorphisms at the levels of population genetics, mutation rates and histories, and relationships to clinical phenotypes (Aim 3). Finally, we will pilot inexpensive in silico genome-wide association studies for mCNVs based on statistical imputation into existing GWAS data sets (Aim 4).
The successful completion of this work will lead to the discovery of relationships between disease risk and gene dosage, helping to reveal the molecular etiology of human disease.

Public Health Relevance

Variation in the human genome influences risk of disease and can be used to find the genes underlying each disease, leading to new ideas for therapies. Many genes can exist in very different numbers of copies (such as 0 to 12) in different people's genomes; this form of variation is not yet well understood. Our work will help us understand this form of genome variation and enable many human geneticists to find specific genes that relate to each disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Brooks, Lisa
Harvard University
Schools of Medicine
United States
Bell, Avery Davis; Usher, Christina L; McCarroll, Steven A (2018) Analyzing Copy Number Variation with Droplet Digital PCR. Methods Mol Biol 1768:143-160
Kamitaki, Nolan; Usher, Christina L; McCarroll, Steven A (2018) Using Droplet Digital PCR to Analyze Allele-Specific RNA Expression. Methods Mol Biol 1768:401-422
Loh, Po-Ru; Genovese, Giulio; Handsaker, Robert E et al. (2018) Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559:350-355
Estrada, Karol; Whelan, Christopher W; Zhao, Fengmei et al. (2018) A whole-genome sequence study identifies genetic risk factors for neuromyelitis optica. Nat Commun 9:1929
Ganna, Andrea; Satterstrom, F Kyle; Zekavat, Seyedeh M et al. (2018) Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am J Hum Genet 102:1204-1211
Merkle, Florian T; Ghosh, Sulagna; Kamitaki, Nolan et al. (2017) Human pluripotent stem cells recurrently acquire and expand dominant negative P53 mutations. Nature 545:229-233
Genovese, Giulio; Fromer, Menachem; Stahl, Eli A et al. (2016) Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat Neurosci 19:1433-1441
Sekar, Aswin; Bialas, Allison R; de Rivera, Heather et al. (2016) Schizophrenia risk from complex variation of complement component 4. Nature 530:177-183
Boettger, Linda M; Salem, Rany M; Handsaker, Robert E et al. (2016) Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 48:359-366
Ganna, Andrea; Genovese, Giulio; Howrigan, Daniel P et al. (2016) Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci 19:1563-1565