The discovery of genome structural variants has increased exponentially with the application of next-generation sequencing (NGS) technologies. Current approaches, however, are biased against variants of particular sizes and sequence contexts. Most notable is the skew against highly duplicated regions, which are enriched in genes and disease-causing variation. This program project will focus on the sequence characterization of more complex structural genetic variation, with a particular emphasis on regions of biomedical relevance. The approach will be to leverage existing and new clone resources in combination with next-generation sequencing technologies to target variation that has not been adequately assessed for copy, content and structure.
The specific aims of this proposal are to 1) discover, sequence and integrate novel insertion sequences, duplicated regions of high diversity, and recurrent structural variants into the human reference genome; 2) develop a next-generation sequencing-based platform to accurately predict copy number and sequence content of duplicated genes for 2,000 genomes being analyzed as part of the 1000 Genomes Project; and 3) generate a BAC clone resource (n=18 individuals) and completely sequence and characterize structurally variant haplotypes for 20 biomedically relevant loci where structural variation predisposes to disease. This program project is a collaborative effort that brings together expertise in large-scale genome sequencing, library production and structural variation. This work will provide fundamental information that will inform and complement efforts of the 1000 Genomes Project's Structural Variation Initiative and the Genome Reference Consortium. It will continue to develop the first high-quality reference set of sequenced variants, provide insight into the molecular mechanisms underlying these differences, and lead to the development of genotyping platforms that will be needed to assess the phenotypic consequences of these regions in terms of human disease and adaptation.

Public Health Relevance

This program project will develop methods, resources and high-quality sequence data corresponding to human genome structural variation that predisposes to both common and rare human genetic diseases. The program focuses particularly on classes of genetic variation that remain poorly understood in ongoing efforts to sequence genomes. The results of this work will provide insight into the mechanisms and risk factors leading to human disease, more fully explore the spectrum of human genetic variation, and lead to the detailed characterization of structural variant haplotypes of biomedical importance.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Program Projects (P01)
Project #
5P01HG004120-06
Application #
8293415
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Brooks, Lisa
Project Start
2007-06-21
Project End
2014-06-30
Budget Start
2012-07-01
Budget End
2013-06-30
Support Year
6
Fiscal Year
2012
Total Cost
$1,055,249
Indirect Cost
$260,161
Name
University of Washington
Department
Genetics
Type
Schools of Medicine
DUNS #
605799469
City
Seattle
State
WA
Country
United States
Zip Code
98195
Watson, C T; Steinberg, K M; Graves, T A et al. (2015) Sequencing of the human IG light chain loci from a hydatidiform mole BAC library reveals locus-specific signatures of genetic diversity. Genes Immun 16:24-34
Huddleston, John; Ranade, Swati; Malig, Maika et al. (2014) Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res 24:688-96
Antonacci, Francesca; Dennis, Megan Y; Huddleston, John et al. (2014) Palindromic GOLGA8 core duplicons promote chromosome 15q13.3 microdeletion and evolutionary instability. Nat Genet 46:1293-302
Nuttle, Xander; Itsara, Andy; Shendure, Jay et al. (2014) Resolving genomic disorder-associated breakpoints within segmental DNA duplications using massively parallel sequencing. Nat Protoc 9:1496-513
Stong, Nicholas; Deng, Zhong; Gupta, Ravi et al. (2014) Subtelomeric CTCF and cohesin binding site organization using improved subtelomere assemblies and a novel annotation pipeline. Genome Res 24:1039-50
Lazaridis, Iosif; Patterson, Nick; Mittnik, Alissa et al. (2014) Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409-13
Watson, Corey T; Steinberg, Karyn M; Huddleston, John et al. (2013) Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation. Am J Hum Genet 92:530-46
Nuttle, Xander; Huddleston, John; O'Roak, Brian J et al. (2013) Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 10:903-9
Itsara, Andy; Vissers, Lisenka E L M; Steinberg, Karyn Meltz et al. (2012) Resolving the breakpoints of the 17q21.31 microdeletion syndrome with next-generation sequencing. Am J Hum Genet 90:599-613
Hormozdiari, Fereydoun; Alkan, Can; Ventura, Mario et al. (2011) Alu repeat discovery and characterization within human genomes. Genome Res 21:840-9

Showing the most recent 10 out of 34 publications