In humans and many other species, a majority of the genome consists of repeated DNA sequences that arose via duplications. A genetic duplication can occur when an offspring inherits two copies of a DNA sequence from a parent who only has one copy; these duplicated regions in the genome are called repeats. Following duplication events, these repeats can subsequently diverge due to mutations that affect one repeat but not others. However, this divergence can be partially or completely counteracted by a phenomenon known as interlocus gene conversion (IGC). IGC increases the similarity of repeated sequences by copying sequence stretches from one repeat copy into the corresponding region of another. The role of IGC mutations in shaping how repeated genomic regions evolve through time is poorly understood. This project aims to develop statistical methods for quantifying and incorporating the impact that IGC has in shaping genomes. By doing so, this research will illuminate the forces that shape repeated regions in genomes and will assist in understanding the biological function of these regions. Project goals will be accomplished with the assistance of undergraduate researchers. These students will have the opportunity to visit collaborators at the University of Tokyo in a cross-cultural exchange. In addition, visualizations will be created to simultaneously teach the general public about statistical inference and the forces that shape genomes.

A variety of likelihood-based statistical tools for studying IGC will be developed and assessed. One goal is to learn how IGC varies among biological species and among genes, with a focus on how IGC is influenced by whole-genome versus tandem duplication. The research will investigate whether some repeat copies tend to be donors and others recipients during IGC events. It will also consider how to accommodate IGC when detecting positive diversifying selection, estimating divergence times, and reconstructing ancestral sequences. The work will combine theoretical development, software implementation, and testing using both simulated and empirical data.
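As a purely illustrative sketch of the process described above, and not the likelihood-based methods this project proposes, the toy simulation below lets two repeat copies diverge through point mutations while IGC events copy a tract from one copy into the other. The function name `simulate_divergence` and all parameters (`igc_fraction`, `tract_length`, sequence length, event count) are hypothetical choices made for illustration only, and the uniform mutation model is an assumption.

```python
import random

BASES = "ACGT"


def simulate_divergence(length=300, events=3000, igc_fraction=0.0,
                        tract_length=30, seed=1):
    """Toy model: return the fraction of sites that differ between two repeats.

    Each event is an IGC event with probability `igc_fraction`,
    otherwise a point mutation in one randomly chosen copy.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    ancestor = [rng.choice(BASES) for _ in range(length)]
    # Duplication: both repeat copies start out identical.
    copies = [list(ancestor), list(ancestor)]

    for _ in range(events):
        if rng.random() < igc_fraction:
            # IGC: overwrite a tract in the recipient with the donor's sequence.
            donor = rng.randrange(2)
            recipient = 1 - donor
            start = rng.randrange(length)
            end = min(start + tract_length, length)
            copies[recipient][start:end] = copies[donor][start:end]
        else:
            # Point mutation: change one site in one copy to a different base.
            target = rng.randrange(2)
            site = rng.randrange(length)
            current = copies[target][site]
            copies[target][site] = rng.choice([b for b in BASES if b != current])

    differing = sum(a != b for a, b in zip(copies[0], copies[1]))
    return differing / length


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        print(fraction, simulate_divergence(igc_fraction=fraction))
```

In this sketch the donor is chosen at random for each IGC event; a donor/recipient bias of the kind the project plans to investigate could be mimicked by choosing one copy as donor more often than the other.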

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Environmental Biology (DEB)
Type: Standard Grant (Standard)
Application #: 1754142
Program Officer: Leslie J. Rissler
Project Start:
Project End:
Budget Start: 2018-08-01
Budget End: 2022-07-31
Support Year:
Fiscal Year: 2017
Total Cost: $564,338
Indirect Cost:
Name: North Carolina State University Raleigh
Department:
Type:
DUNS #:
City: Raleigh
State: NC
Country: United States
Zip Code: 27695