The discovery of genome structural variants has increased exponentially with the application of next-generation sequencing (NGS) technologies. The current approaches, however, are biased against variants of particular sizes and sequence contexts. Most notable is the skew against highly duplicated regions, which are enriched in genes and disease-causing variation. This program project will focus on the sequence characterization of more complex structural genetic variation, with a particular emphasis on regions of biomedical relevance. The approach will be to leverage existing and new clone resources in combination with next-generation sequencing technologies to target variation that has not been adequately assessed for copy, content and structure.
The specific aims of this proposal are to 1) discover, sequence and integrate novel insertion sequences, duplicated regions of high diversity, and recurrent structural variants into the human reference genome; 2) develop a next-generation sequencing based platform to accurately predict copy number and sequence content of duplicated genes for 2,000 genomes being analyzed as part of the 1000 Genomes Project; and 3) generate a BAC clone resource (n=18 individuals) and completely sequence and characterize structurally variant haplotypes for 20 biomedically relevant loci where structural variation predisposes to disease. This program project is a collaborative effort that brings together expertise in large-scale genome sequencing, library production and structural variation. This work will provide fundamental information that will inform and complement efforts as part of the 1000 Genomes Project's Structural Variation Initiative and the Genome Reference Consortium. It will continue to develop the first high-quality reference set of sequenced variants, provide insight into the molecular mechanisms underlying these differences, and lead to the development of genotyping platforms that will be needed to assess the phenotypic consequences of these regions in terms of human disease and adaptation.

Public Health Relevance

This program project will develop methods, resources and high-quality sequence data corresponding to human genome structural variation that predisposes to both common and rare human genetic diseases. The program focuses particularly on classes of genetic variation that remain poorly understood in ongoing efforts to sequence genomes. The results of this work will provide insight into the mechanisms and risk factors leading to human disease, explore the spectrum of human genetic variation more fully, and lead to the detailed characterization of structural variant haplotypes of biomedical importance.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Program Projects (P01)
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Brooks, Lisa
University of Washington
Schools of Medicine
United States
Eslami Rasekh, Marzieh; Chiatante, Giorgia; Miroballo, Mattia et al. (2017) Discovery of large genomic inversions using long range information. BMC Genomics 18:65
Watson, Corey T; Steinberg, Karyn Meltz; Graves, Tina A et al. (2015) Sequencing of the human IG light chain loci from a hydatidiform mole BAC library reveals locus-specific signatures of genetic diversity. Genes Immun 16:24-34
Nuttle, Xander; Itsara, Andy; Shendure, Jay et al. (2014) Resolving genomic disorder-associated breakpoints within segmental DNA duplications using massively parallel sequencing. Nat Protoc 9:1496-513
Lazaridis, Iosif; Patterson, Nick; Mittnik, Alissa et al. (2014) Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409-13
Huddleston, John; Ranade, Swati; Malig, Maika et al. (2014) Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res 24:688-96
Stong, Nicholas; Deng, Zhong; Gupta, Ravi et al. (2014) Subtelomeric CTCF and cohesin binding site organization using improved subtelomere assemblies and a novel annotation pipeline. Genome Res 24:1039-50
Antonacci, Francesca; Dennis, Megan Y; Huddleston, John et al. (2014) Palindromic GOLGA8 core duplicons promote chromosome 15q13.3 microdeletion and evolutionary instability. Nat Genet 46:1293-302
Steinberg, Karyn Meltz; Schneider, Valerie A; Graves-Lindsay, Tina A et al. (2014) Single haplotype assembly of the human genome from a hydatidiform mole. Genome Res 24:2066-76
Mueller, Michael; Barros, Paula; Witherden, Abigail S et al. (2013) Genomic pathology of SLE-associated copy-number variation at the FCGR2C/FCGR3B/FCGR2B locus. Am J Hum Genet 92:28-40
Nuttle, Xander; Huddleston, John; O'Roak, Brian J et al. (2013) Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 10:903-9

Showing the most recent 10 out of 41 publications