Biological research is increasingly dependent upon "finished" genome sequence as a baseline for further research. More than 99% of the targeted human genome is now represented as high-quality finished sequence with each base ordered and orientated. Two major types of gaps remain: heterochromatic gaps (estimated at ~190 Mb) and euchromatic gaps (23.0 Mb). Within euchromatic regions, 54.5% (168/308) of all assembly gaps are flanked by segmental duplication. The greatest gap density within the finished genome occurs within the 2 Mb transition regions between the centromere and euchromatic DNA. We propose that duplications and large-scale structural variation have complicated the sequencing and assembly of these regions, creating de facto gaps. This grant outlines a systematic strategy to target the sequencing and assembly of pericentromeric DNA using genomic libraries of haploid complexity. Comparative sequence analysis of one pericentromeric region among primates will serve as a model to understand the pattern of structural variation as a function of evolutionary time. In addition, this competitive renewal develops a computational pipeline that supports the analysis of duplication content within other mammalian genomes. The results of this analysis will provide a framework for understanding these regions in other organisms and will complement ongoing NHGRI-approved whole-genome shotgun sequencing efforts. The presence of recent segmental duplications remains the single most important predictor of gap location within euchromatic sequence. The resolution of these exceptional regions is, therefore, critical for accurate assembly and annotation of genomes.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG002385-06
Application #: 7123963
Study Section: Special Emphasis Panel (ZRG1-GTIE (02))
Program Officer: Schloss, Jeffery
Project Start: 2001-09-21
Project End: 2007-07-31
Budget Start: 2006-08-01
Budget End: 2007-07-31
Support Year: 6
Fiscal Year: 2006
Total Cost: $372,217
Indirect Cost:

Name: University of Washington
Department: Genetics
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195

Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Catacchio, Claudia Rita; Maggiolini, Flavia Angela Maria; D'Addabbo, Pietro et al. (2018) Inversion variants in human and primate genomes. Genome Res 28:910-920
Fiddes, Ian T; Lodewijk, Gerrald A; Mooring, Meghan et al. (2018) Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 173:1356-1369.e22
Cantsilieris, Stuart; Nelson, Bradley J; Huddleston, John et al. (2018) Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 115:E4433-E4442
Dougherty, Max L; Underwood, Jason G; Nelson, Bradley J et al. (2018) Transcriptional fates of human-specific segmental duplications in brain. Genome Res 28:1566-1576
Chiatante, Giorgia; Giannuzzi, Giuliana; Calabrese, Francesco Maria et al. (2017) Centromere Destiny in Dicentric Chromosomes: New Insights from the Evolution of Human Chromosome 2 Ancestral Centromeric Region. Mol Biol Evol 34:1669-1681
Kuderna, Lukas F K; Tomlinson, Chad; Hillier, LaDeana W et al. (2017) A 3-way hybrid approach to generate a new high-quality chimpanzee reference genome (Pan_tro_3.0). Gigascience 6:1-6
Dougherty, Max L; Nuttle, Xander; Penn, Osnat et al. (2017) The birth of a human-specific neural gene by incomplete duplication and gene fusion. Genome Biol 18:49
Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin et al. (2017) Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 27:849-864
Tolomeo, Doron; Capozzi, Oronzo; Stanyon, Roscoe R et al. (2017) Epigenetic origin of evolutionary novel centromeres. Sci Rep 7:41980

Showing the most recent 10 out of 86 publications