Despite the high quality of the human reference genome, important gaps remain in our understanding of its sequence organization, function, and variation. Our genome is particularly enriched for interspersed segmental duplications, which harbor rapidly evolving genes and predispose our species to recurrent rearrangements associated with disease. The long-term objective of our research has been to develop computational and experimental methods to understand the organization, genetic diversity, and disease impact of segmental duplications. The goal of this competing renewal is to begin to understand the function and variation of the duplicated genes themselves. We propose to focus here on human- and great ape-specific gene families mapping within the most complex and duplicated regions of our genome. There are four aims: (1) determine the sequence structure of these recent duplications by generating high-quality reference sequences using clone-based resources and long-read sequencing technologies; (2) understand the genetic diversity of these duplications, focusing on those that have most likely been targets of selection; (3) completely annotate the gene content to distinguish protein-encoding innovations from pseudogenes; and (4) test for neurodevelopmental disease association by comparing the burden of loss-of-function mutations in patients versus controls using available genome sequence data and molecular inversion probe assays. We hypothesize that segmental duplications have played an important role in human neurocognitive adaptation and that patterns of copy number polymorphism and substitution will differ significantly between functional and nonfunctional paralogs. This research has the additional benefit that it will add new sequence to reference genomes, identify missing genes, and provide us with the ability to systematically explore genetic variation in regions frequently overlooked in disease-association studies.
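The burden comparison in aim (4) can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test on carrier counts, which is a common choice for case-control burden analysis but is not stated in the summary as the method used here. The function name and the counts in the usage line are hypothetical and are not taken from the project.

```python
from math import comb


def burden_p_value(case_lof, case_total, ctrl_lof, ctrl_total):
    """One-sided Fisher's exact test for excess loss-of-function (LoF)
    carriers among cases relative to controls.

    Assumption: each individual is counted once as carrier or non-carrier.
    Returns P(X >= case_lof) under the hypergeometric null.
    """
    if not (0 <= case_lof <= case_total and 0 <= ctrl_lof <= ctrl_total):
        raise ValueError("carrier counts must lie between 0 and the group size")

    n = case_total + ctrl_total      # all individuals
    carriers = case_lof + ctrl_lof   # all LoF carriers
    denom = comb(n, case_total)      # ways to choose which individuals are cases

    # Sum the hypergeometric probabilities of seeing case_lof or more
    # carriers among the cases, keeping exact integers until the final division.
    numer = 0
    for k in range(case_lof, min(carriers, case_total) + 1):
        numer += comb(carriers, k) * comb(n - carriers, case_total - k)
    return numer / denom


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(burden_p_value(case_lof=12, case_total=2000, ctrl_lof=3, ctrl_total=2000))
```

A small p-value would indicate more LoF carriers among patients than expected by chance. In practice such a test would be run per gene or per paralog and corrected for multiple testing.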
This proposal focuses on characterizing the sequence structure, variation, and annotation of protein-coding genes within complex regions of duplication that have been difficult to sequence and assemble. The work will improve the quality of the reference genome, provide a fundamental understanding of how new genes arise, and develop a novel approach to rapidly assess genetic variation of these genes as they relate to human disease and adaptation.
Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro et al. (2018) Inversion variants in human and primate genomes. Genome Res 28:910-920
Fiddes, Ian T; Lodewijk, Gerrald A; Mooring, Meghan et al. (2018) Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 173:1356-1369.e22
Cantsilieris, Stuart; Nelson, Bradley J; Huddleston, John et al. (2018) Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 115:E4433-E4442
Dougherty, Max L; Underwood, Jason G; Nelson, Bradley J et al. (2018) Transcriptional fates of human-specific segmental duplications in brain. Genome Res 28:1566-1576
Chaisson, Mark J; Mukherjee, Sudipto; Kannan, Sreeram et al. (2017) Resolving multicopy duplications de novo using polyploid phasing. Res Comput Mol Biol 10229:117-133
Dennis, Megan Y; Harshman, Lana; Nelson, Bradley J et al. (2017) The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol 1:69
Chiatante, Giorgia; Giannuzzi, Giuliana; Calabrese, Francesco Maria et al. (2017) Centromere Destiny in Dicentric Chromosomes: New Insights from the Evolution of Human Chromosome 2 Ancestral Centromeric Region. Mol Biol Evol 34:1669-1681
Kuderna, Lukas F K; Tomlinson, Chad; Hillier, LaDeana W et al. (2017) A 3-way hybrid approach to generate a new high-quality chimpanzee reference genome (Pan_tro_3.0). Gigascience 6:1-6
Dougherty, Max L; Nuttle, Xander; Penn, Osnat et al. (2017) The birth of a human-specific neural gene by incomplete duplication and gene fusion. Genome Biol 18:49
Showing the most recent 10 out of 86 publications