Over the coming years, human genetics will sequence tens of thousands of whole genomes, enabled by a profound reduction in the cost of sequencing. These data offer unprecedented opportunities to ascertain how the human genome varies. Our interest is in understanding how human genomes are structured and vary at large scales, from the kilobase scale up to entire chromosome arms. A basic challenge in this area of research is how to use short (150 bp) sequence reads to infer genomic relationships that play out at far larger spatial scales. One approach is to look toward emerging genomic technologies (such as long-read technologies) to eventually solve this problem. While there is much interesting work on emerging technologies, our focus is on learning the greatest possible amount from the kinds of data that are already being generated in great abundance, on tens of thousands of genomes of individuals with many diseases and other clinical phenotypes. We believe that this can be accomplished by creatively analyzing the statistical patterns that large collections of sequence reads form across individuals, families, and populations. In recent years, we used existing whole-genome-sequence and whole-exome-sequence data to discover surprising basic principles related to multi-allelic CNVs, human genome replication, and "missing pieces" of the reference human genome. In the coming years, we aim to use emerging WGS data to more deeply understand complex and multi-allelic CNVs, reveal the genome sequence variation within duplicated sequences, map dispersed duplications, and ascertain somatic mosaicism. We hope that this work contributes to many discoveries about the genetic and biological basis of disease.

Public Health Relevance

Human genomes vary at scales large and small; genetic variation that associates with disease can offer powerful clues about the biological basis of disease. The goal of our work is to develop new and powerful ways to use emerging genome-sequencing data to understand how human genomes vary at large scales, and which of these variations underlie risk of disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG006855-07
Application #
9492398
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Brooks, Lisa
Project Start
2012-08-18
Project End
2020-05-31
Budget Start
2018-06-01
Budget End
2019-05-31
Support Year
7
Fiscal Year
2018
Name
Harvard Medical School
Department
Genetics
Type
Schools of Medicine
DUNS #
047006379
City
Boston
State
MA
Country
United States
Zip Code
Ganna, Andrea; Satterstrom, F Kyle; Zekavat, Seyedeh M et al. (2018) Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am J Hum Genet 102:1204-1211
Bell, Avery Davis; Usher, Christina L; McCarroll, Steven A (2018) Analyzing Copy Number Variation with Droplet Digital PCR. Methods Mol Biol 1768:143-160
Kamitaki, Nolan; Usher, Christina L; McCarroll, Steven A (2018) Using Droplet Digital PCR to Analyze Allele-Specific RNA Expression. Methods Mol Biol 1768:401-422
Loh, Po-Ru; Genovese, Giulio; Handsaker, Robert E et al. (2018) Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559:350-355
Estrada, Karol; Whelan, Christopher W; Zhao, Fengmei et al. (2018) A whole-genome sequence study identifies genetic risk factors for neuromyelitis optica. Nat Commun 9:1929
Merkle, Florian T; Ghosh, Sulagna; Kamitaki, Nolan et al. (2017) Human pluripotent stem cells recurrently acquire and expand dominant negative P53 mutations. Nature 545:229-233
Genovese, Giulio; Fromer, Menachem; Stahl, Eli A et al. (2016) Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat Neurosci 19:1433-1441
Sekar, Aswin; Bialas, Allison R; de Rivera, Heather et al. (2016) Schizophrenia risk from complex variation of complement component 4. Nature 530:177-183
Boettger, Linda M; Salem, Rany M; Handsaker, Robert E et al. (2016) Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 48:359-366
Ganna, Andrea; Genovese, Giulio; Howrigan, Daniel P et al. (2016) Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci 19:1563-1565

Showing the most recent 10 out of 19 publications