The human genome is arguably one of the best-annotated and best-assembled mammalian genomes; nevertheless, important gaps remain in our understanding of its sequence organization, function, and evolution. Our genome is particularly enriched for complex segmental duplications, which harbor rapidly evolving genes and predispose our species' genome to non-allelic homologous recombination and disease. The long-term objective of our research has been to develop computational and experimental methods to understand the organization, diversity, and disease impact of segmental duplications. The goal of this competing renewal is to begin to understand the function and evolution of the duplicated genes themselves. We propose to focus here on 13 human- and great ape-specific gene families that have expanded within the last 15 million years of ape evolution. There are three aims: (1) Understand the genetic diversity and structure of these recent duplications by generating high-quality reference sequences using clone-based resources and long-read sequencing technologies; (2) Reconstruct gene family history by comparative sequencing of loci in great apes and exploring changes in gene structure, rates of substitution, and expression; and (3) Develop a robust genotyping assay based on molecular inversion probes to assess the genetic variation of these genes at the population level and the effect of forces such as non-allelic gene conversion. We hypothesize that segmental duplications have played an important role in human neurocognitive adaptation and that patterns of copy number polymorphisms and substitution will differ significantly between functional and nonfunctional paralogs. This research has the additional benefit that it will add new sequence to the human genome, identify missing genes, and provide us with the ability to systematically explore genetic variation of regions frequently overlooked as part of disease-association studies.
This proposal focuses on the genetic characterization of ~120 genes from 13 gene families within complex regions of duplication that have been difficult to sequence and assemble. The work will improve the quality of the genome, provide a fundamental understanding of how new genes arise, and develop a novel approach to rapidly assess genetic variation of these genes as they relate to human disease and evolutionary adaptation.