Structural variations (SVs) are defined as medium and large genome rearrangements. A growing body of evidence shows that SVs are a major contributing factor to diseases and complex traits, and that they play an important role in population genomics and evolution. However, much remains unknown about SVs, including their diversity, complexity, distribution within populations, and exact biological impact. Recent progress in genome technologies, especially high-throughput sequencing, has provided an opportunity to investigate the complexity of SVs in genomes. Yet a lack of computational approaches for efficient discovery and genotyping of different types of SVs, including complex ones, has hindered comprehensive study of their complexity and diversity. The goal of this project is to develop novel combinatorial methods that give researchers the tools needed to better capture the diversity of SVs and their potential biological impact. The results of this research will have applications across a wide range of areas in genomics, from evolution to disease. This project will also achieve broader impact by providing training opportunities for both undergraduate and graduate students interested in computational genomics.

This project seeks to develop novel computational methods to address some of the main challenges in studying SVs. As part of this project, novel combinatorial methods will be developed for efficient and accurate genotyping of any SV using ever-changing sequencing technologies. The project will provide researchers with the tools needed for ultra-efficient genotyping of a set of polymorphic SVs in a large cohort of samples sequenced with short-read technologies. Furthermore, novel mapping-free approaches for comparative SV discovery using long-read sequencing data will be developed. These approaches will provide the methods needed to study the diverse set of SVs (including hard-to-detect and complex SVs) in sequenced samples of any species. A combinatorial approach will also be developed to predict the functional impact that SVs exert by altering the chromatin structure of the genome. Finally, to establish the utility of these methods, the investigators will use them to analyze publicly available data from a diverse set of species. The results of the projects will be available at

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

National Science Foundation (NSF)
Division of Biological Infrastructure (DBI)
Program Officer: Peter McCartney
University of California Davis
United States