Genomic structural variants (SVs), which comprise deletions, duplications, insertions, inversions, and translocations of sequences, are an abundant source of genetic variation. SVs have been linked to Mendelian diseases, complex heritable diseases such as schizophrenia, and cancer. However, recent comparisons of extremely contiguous genome assemblies of humans and the model organism Drosophila melanogaster have revealed that common genotyping strategies relying on high-throughput short reads miss 40-80% of SVs, including those affecting phenotypes. Thus, the contribution of SVs to disease and phenotypic variation remains grossly underestimated. To accurately measure the contribution of SVs to deleterious genetic variation and trait variation, we propose to create a comprehensive map of genome-wide SVs by comparing extremely contiguous genome assemblies. However, contiguous de novo assembly of human genomes from high-coverage (>50X) noisy long reads remains prohibitively expensive. I therefore propose to analyze SVs in the 25-fold smaller genome of the model organism D. melanogaster, which has contributed substantially to our understanding of the genetics of complex human diseases. The proposed research aims to study the fitness effects of polymorphic SVs based on de novo genome assemblies of 50 genetically diverse D. melanogaster strains that are as complete and contiguous as the current D. melanogaster reference genome, arguably the best metazoan genome assembly (Aim 1). I propose to use this comprehensive set of variants to infer the distribution of fitness effects of the SVs and to estimate the proportion of adaptive SVs, both of which are reliable proxies for the evolutionary and functional significance of SVs (Aim 1).
Aim 1 will involve training in theory and cutting-edge methods in molecular population genetics. Next, the proposed project will develop an experimental approach to determine the fitness effects of variants for which an organismal phenotype is unknown. As part of this, the project will develop genome editing resources that facilitate rapid transformation of one of our sequenced strains with SVs, so that the fitness effects of candidate SVs from trait mapping studies can be examined (Aim 2). Training in Aim 2 includes development of a CRISPR-Cas9 toolkit in a common genetic background to investigate the functional effects of SVs. Finally, using the toolkit developed in Aim 2, we propose to conduct high-throughput fitness assays to evaluate the selective effects of SVs under specific selection conditions (Aim 3). The training portion of the proposed research will complement the applicant's previous experience and position him for a successful research career. University of California Irvine and the Emerson and Long labs together have the resources and expertise to ensure the successful completion of the training phase of the grant.

Public Health Relevance

Changes in genome structure due to copying, deletion, rearrangement, or other reorganization of sequences are a major source of genetic variation in all organisms, including humans. Although structural variants cause many heritable diseases and underlie phenotypic changes, 40-80% of this genetic variation remains hidden from common genotyping strategies. Here, we propose to create a comprehensive catalog of polymorphic structural genetic variation using 50 de novo platinum genome assemblies of genetically diverse D. melanogaster strains and to investigate the fitness consequences of these variants using population genetic theory and experimental fitness assays.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Career Transition Award (K99)
Project #
1K99GM129411-01A1
Application #
9744181
Study Section
Special Emphasis Panel (ZGM1)
Program Officer
Sesma, Michael A
Project Start
2019-04-01
Project End
2021-03-31
Budget Start
2019-04-01
Budget End
2021-03-31
Support Year
1
Fiscal Year
2019
Name
University of California Irvine
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
046705849
City
Irvine
State
CA
Country
United States
Zip Code
92617