Several compelling questions in genome sequence analysis have been compromised by errors and gaps in the available genome assemblies. A telomere-to-telomere, platinum-quality human genome sequence would open doors to investigating many problems associated with genetic disease, and the development of platinum-quality model organism genomes will allow early exploration of the most efficient ways to pursue these questions. To demonstrate the utility of multiple reference-quality genomes, we have formulated several questions about genome evolution that make use of the Drosophila model system. These questions include (1) identifying new genes that originated within the Drosophila-specific clade, (2) estimating the rates of new gene evolution and examining the variation and constancy of those rates among Drosophila lineages, (3) quantifying rates and patterns of divergence of piRNA clusters, which are critical to host regulation of transposable elements, (4) analyzing sequence divergence in heterochromatic repeats, which are known to play key roles in centromere and telomere function as well as in modulating chromatin states, and (5) analyzing Y chromosome gene gain and loss across the pan-Y chromosome. By obtaining and annotating reference-quality genome sequences of 19 Drosophila species spanning 40-60 MY of evolutionary history, using an efficient scheme that combines deep long-read (PacBio) assembly with targeted sequencing of bacterial artificial chromosomes, we will produce a resource that will pave the way for the Drosophila community to tackle pressing hypothesis-driven questions in the field, including embryonic development, neurobiology, and aging, all within a phylogenomics perspective.

Public Health Relevance

Title: "Reference-quality Drosophila genome assemblies for evolutionary analysis of previously inaccessible genomic regions"

Project Narrative: This project will develop and interrogate an essential set of high-quality genomes and study the evolution of new genes and previously inaccessible, yet biologically important, genomic regions that encompass 40-60 million years of functional adaptation. Our work will yield key infrastructure, approaches and insights to understand a broad range of biologically important phenomena related to human health, from genetics and embryonic development to neurobiology and other major biological research areas, all within a phylogenomics perspective.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
1R01GM116113-01A1
Application #
9176649
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Janes, Daniel E
Project Start
2016-09-08
Project End
2020-08-31
Budget Start
2016-09-08
Budget End
2017-08-31
Support Year
1
Fiscal Year
2016
Total Cost
$551,242
Indirect Cost
$171,155
Name
University of Arizona
Department
Other Basic Sciences
Type
Schools of Earth Sciences/Natur
DUNS #
806345617
City
Tucson
State
AZ
Country
United States
Zip Code
85721
VanKuren, Nicholas W; Long, Manyuan (2018) Gene duplicates resolving sexual conflict rapidly evolved essential gametogenesis functions. Nat Ecol Evol 2:705-712