Despite the high quality of the human reference genome, important gaps remain in our understanding of its sequence organization, function, and variation. Our genome is particularly enriched for interspersed segmental duplications, which harbor rapidly evolving genes and predispose our species to recurrent rearrangements associated with disease. The long-term objective of our research has been to develop computational and experimental methods to understand the organization, genetic diversity, and disease impact of segmental duplications. The goal of this competing renewal is to begin to understand the function and variation of the duplicated genes themselves. We propose to focus here on human- and great ape-specific gene families mapping within the most complex and duplicated regions of our genome. There are four aims: (1) determine the sequence structure of these recent duplications by generating high-quality reference sequences using clone-based resources and long-read sequencing technologies; (2) understand the genetic diversity of this structure, focusing on the duplications that have most likely been targets of selection; (3) completely annotate the gene content to distinguish protein-encoding innovations from pseudogenes; and (4) test for neurodevelopmental disease association by comparing the burden of loss-of-function mutations in patients versus controls using available genome sequence data and molecular inversion probe assays. We hypothesize that segmental duplications have played an important role in human neurocognitive adaptation and that patterns of copy number polymorphism and substitution will differ significantly between functional and nonfunctional paralogs. This research has the additional benefit that it will add new sequence to reference genomes, identify missing genes, and provide us with the ability to systematically explore genetic variation in regions frequently overlooked in disease-association studies.

Public Health Relevance

This proposal focuses on characterizing the sequence structure, variation, and annotation of protein-coding genes within complex regions of duplication that have been difficult to sequence and assemble. The work will improve the quality of the reference genome, provide a fundamental understanding of how new genes arise, and develop a novel approach to rapidly assess genetic variation of these genes as they relate to human disease and adaptation.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
University of Washington
Schools of Medicine
United States
Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro et al. (2018) Inversion variants in human and primate genomes. Genome Res 28:910-920
Fiddes, Ian T; Lodewijk, Gerrald A; Mooring, Meghan et al. (2018) Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 173:1356-1369.e22
Cantsilieris, Stuart; Nelson, Bradley J; Huddleston, John et al. (2018) Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 115:E4433-E4442
Dougherty, Max L; Underwood, Jason G; Nelson, Bradley J et al. (2018) Transcriptional fates of human-specific segmental duplications in brain. Genome Res 28:1566-1576
Chiatante, Giorgia; Giannuzzi, Giuliana; Calabrese, Francesco Maria et al. (2017) Centromere Destiny in Dicentric Chromosomes: New Insights from the Evolution of Human Chromosome 2 Ancestral Centromeric Region. Mol Biol Evol 34:1669-1681
Kuderna, Lukas F K; Tomlinson, Chad; Hillier, LaDeana W et al. (2017) A 3-way hybrid approach to generate a new high-quality chimpanzee reference genome (Pan_tro_3.0). Gigascience 6:1-6
Dougherty, Max L; Nuttle, Xander; Penn, Osnat et al. (2017) The birth of a human-specific neural gene by incomplete duplication and gene fusion. Genome Biol 18:49
Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin et al. (2017) Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 27:849-864
Tolomeo, Doron; Capozzi, Oronzo; Stanyon, Roscoe R et al. (2017) Epigenetic origin of evolutionary novel centromeres. Sci Rep 7:41980

Showing the most recent 10 out of 86 publications