The discovery of genome structural variants has increased exponentially with the application of next-generation sequencing (NGS) technologies. Current approaches, however, are biased against variants of particular sizes and sequence contexts. Most notable is the skew against highly duplicated regions, which are enriched in genes and disease-causing variation. This program project will focus on the sequence characterization of more complex structural genetic variation, with particular emphasis on regions of biomedical relevance. The approach will be to leverage existing and new clone resources in combination with next-generation sequencing technologies to target variation that has not been adequately assessed for copy number, content and structure.
The specific aims of this proposal are to 1) discover, sequence and integrate novel insertion sequences, duplicated regions of high diversity, and recurrent structural variants into the human reference genome; 2) develop a next-generation sequencing-based platform to accurately predict copy number and sequence content of duplicated genes for 2,000 genomes being analyzed as part of the 1000 Genomes Project; and 3) generate a BAC clone resource (n=18 individuals) and completely sequence and characterize structurally variant haplotypes for 20 biomedically relevant loci where structural variation predisposes to disease. This program project is a collaborative effort that brings together expertise in large-scale genome sequencing, library production and structural variation. This work will provide fundamental information that will inform and complement the efforts of the 1000 Genomes Project's Structural Variation Initiative and the Genome Reference Consortium. It will continue to develop the first high-quality reference set of sequenced variants, provide insight into the molecular mechanisms underlying these differences, and lead to the development of genotyping platforms that will be needed to assess the phenotypic consequences of these regions in terms of human disease and adaptation.
This program project will develop methods, resources and high-quality sequence data corresponding to human genome structural variation that predisposes to both common and rare human genetic diseases. The program focuses particularly on classes of genetic variation that remain poorly understood in ongoing efforts to sequence genomes. The results of this work will provide insight into the mechanisms and risk factors leading to human disease, more fully explore the spectrum of human genetic variation, and lead to the detailed characterization of structural variant haplotypes of biomedical importance.
Watson, C T; Steinberg, K M; Graves, T A et al. (2015) Sequencing of the human IG light chain loci from a hydatidiform mole BAC library reveals locus-specific signatures of genetic diversity. Genes Immun 16:24-34
Huddleston, John; Ranade, Swati; Malig, Maika et al. (2014) Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res 24:688-96
Stong, Nicholas; Deng, Zhong; Gupta, Ravi et al. (2014) Subtelomeric CTCF and cohesin binding site organization using improved subtelomere assemblies and a novel annotation pipeline. Genome Res 24:1039-50
Steinberg, Karyn Meltz; Schneider, Valerie A; Graves-Lindsay, Tina A et al. (2014) Single haplotype assembly of the human genome from a hydatidiform mole. Genome Res 24:2066-76
Antonacci, Francesca; Dennis, Megan Y; Huddleston, John et al. (2014) Palindromic GOLGA8 core duplicons promote chromosome 15q13.3 microdeletion and evolutionary instability. Nat Genet 46:1293-302
Lazaridis, Iosif; Patterson, Nick; Mittnik, Alissa et al. (2014) Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409-13
Nuttle, Xander; Itsara, Andy; Shendure, Jay et al. (2014) Resolving genomic disorder-associated breakpoints within segmental DNA duplications using massively parallel sequencing. Nat Protoc 9:1496-513
Mueller, Michael; Barros, Paula; Witherden, Abigail S et al. (2013) Genomic pathology of SLE-associated copy-number variation at the FCGR2C/FCGR3B/FCGR2B locus. Am J Hum Genet 92:28-40
Watson, Corey T; Steinberg, Karyn M; Huddleston, John et al. (2013) Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation. Am J Hum Genet 92:530-46
Nuttle, Xander; Huddleston, John; O'Roak, Brian J et al. (2013) Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 10:903-9
Showing the most recent 10 out of 41 publications