Population-Based Approaches to Genome Structure and Structural Variation

McCarroll, Steven

Abstract

Over the coming years, human genetics will sequence tens of thousands of whole genomes, enabled by profound reduction in the costs of sequencing. These data offer unprecedented opportunities to ascertain how the human genome varies. Our interest is in understanding how human genomes are structured and vary at large scales, from the kilobase scale up to entire chromosome arms. A basic challenge in this area of research has involved how to use short (150 bp) sequence reads to infer genomic relationships that play out at far-larger spatial scales. Of course, one approach to this is to look toward emerging genomic technologies (such as long-read technologies) to eventually solve this problem; while there is much interesting work on emerging technologies, our focus is on learning the greatest possible amount from the kinds of data that are already being generated in great abundance, on tens of thousands of genomes of individuals with many diseases and other clinical phenotypes. We believe that this can be accomplished by creatively analyzing the statistical patterns that large collections of sequence reads form across individuals, families, and populations. In recent years, we used existing whole-genome-sequence and whole-exome-sequence data to discover surprising basic principles related to multi-allelic CNVs, human genome replication, and "missing pieces" of the reference human genome. In the coming years, we aim to use emerging WGS data to more deeply understand complex and multi-allelic CNVs, reveal the genome sequence variation within duplicated sequences, map dispersed duplications, and ascertain somatic mosaicism. We hope that this work contributes to many discoveries about the genetic and biological basis of disease.

Public Health Relevance

Human genomes vary at scales large and small; genetic variation that associates with disease can offer powerful clues about the biological basis of disease. The goal of our work is to develop new and powerful ways to use emerging genome-sequencing data to understand how human genomes vary at large scales, and which of these variations underlies risk of disease.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006855-06
Application #: 9335937
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa

Project Start: 2012-08-18
Project End: 2020-05-31
Budget Start: 2017-06-01
Budget End: 2018-05-31
Support Year: 6
Fiscal Year: 2017
Total Cost: $655,416
Indirect Cost: $268,740

Institution

Name: Harvard Medical School
Department: Genetics
Type: Schools of Medicine
DUNS #: 047006379

City: Boston
State: MA
Country: United States
Zip Code: 02115

Related projects


NIH 2019 R01 HG	Population-Based Approaches to Genome Structure and Structural Variation McCarroll, Steven Andrew / Harvard Medical School
NIH 2018 R01 HG	Population-Based Approaches to Genome Structure and Structural Variation McCarroll, Steven Andrew / Harvard Medical School
NIH 2017 R01 HG	Population-Based Approaches to Genome Structure and Structural Variation McCarroll, Steven Andrew / Harvard Medical School	$655,416
NIH 2016 R01 HG	Population-Based Approaches to Genome Structure and Structural Variation McCarroll, Steven Andrew / Harvard Medical School	$655,416
NIH 2015 R01 HG	Multi-allelic copy number variation of the human genome McCarroll, Steven Andrew / Harvard Medical School	$487,499
NIH 2014 R01 HG	Multi-allelic copy number variation of the human genome McCarroll, Steven Andrew / Harvard University	$490,000
NIH 2013 R01 HG	Multi-allelic copy number variation of the human genome McCarroll, Steven Andrew / Harvard University	$477,500
NIH 2012 R01 HG	Multi-allelic copy number variation of the human genome McCarroll, Steven Andrew / Harvard University	$500,000

Publications

Ganna, Andrea; Satterstrom, F Kyle; Zekavat, Seyedeh M et al. (2018) Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am J Hum Genet 102:1204-1211

Bell, Avery Davis; Usher, Christina L; McCarroll, Steven A (2018) Analyzing Copy Number Variation with Droplet Digital PCR. Methods Mol Biol 1768:143-160

Kamitaki, Nolan; Usher, Christina L; McCarroll, Steven A (2018) Using Droplet Digital PCR to Analyze Allele-Specific RNA Expression. Methods Mol Biol 1768:401-422

Loh, Po-Ru; Genovese, Giulio; Handsaker, Robert E et al. (2018) Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559:350-355

Estrada, Karol; Whelan, Christopher W; Zhao, Fengmei et al. (2018) A whole-genome sequence study identifies genetic risk factors for neuromyelitis optica. Nat Commun 9:1929

Merkle, Florian T; Ghosh, Sulagna; Kamitaki, Nolan et al. (2017) Human pluripotent stem cells recurrently acquire and expand dominant negative P53 mutations. Nature 545:229-233

Ganna, Andrea; Genovese, Giulio; Howrigan, Daniel P et al. (2016) Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci 19:1563-1565

Genovese, Giulio; Fromer, Menachem; Stahl, Eli A et al. (2016) Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat Neurosci 19:1433-1441

Sekar, Aswin; Bialas, Allison R; de Rivera, Heather et al. (2016) Schizophrenia risk from complex variation of complement component 4. Nature 530:177-183

Boettger, Linda M; Salem, Rany M; Handsaker, Robert E et al. (2016) Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 48:359-366

Showing the most recent 10 out of 19 publications
