The molecular process of adaptation, the rise in frequency of genetic variants that enable organisms to succeed in their environments, is central to evolutionary biology. Surmounting significant challenges, such as the ability of infectious agents to evolve resistance to drugs and the ability of crop pests to defeat a diverse array of increasingly powerful insecticides, requires an understanding of the nature of adaptation. Recent advances have demonstrated that adaptation often occurs via "soft selective sweeps," in which an adaptive genetic variant originates multiple times or becomes favored only after it has been present at a substantial frequency in the population. This project advances knowledge of the fundamental evolutionary process of adaptation by developing new computational tools to detect and study adaptation by soft selective sweeps. Through the interactions of a multidisciplinary team spanning evolutionary biology and bioinformatics, the project integrates advances in evolutionary simulation with modern and efficient computational methods in order to improve the understanding of adaptation, while simultaneously developing efficient computational tools applicable in the modern "big-data" era of inexpensive sequencing. In addition, its joint mentorship efforts from evolutionary and bioinformatics perspectives promote interdisciplinary training of graduate students and postdoctoral scientists.

The project has four objectives: (1) To design new tests for detecting selection in the case in which soft selective sweeps occur from standing genetic variation; (2) To identify haplotypes that carry a beneficial allele in genomic regions known to be experiencing positive selection; (3) To enhance new methods of analysis of natural selection to make them robust to confounding demographic scenarios; (4) To apply new selection methods in a series of data sets from multiple species, including humans, Drosophila, and Plasmodium malaria parasites. The project will use algorithmic techniques from combinatorial optimization and machine learning, and it will exploit ideas from population genetics and coalescent theory. It breaks ground on several fronts, providing a deeper understanding of the patterns in site-frequency spectra and haplotype data as a basis for selection signatures, and assisting in the design of subtyping studies for complex regions of the genome. As it becomes increasingly possible to sequence whole genomes of multiple individuals within a population, the intellectual challenge of designing tools for detecting selection to accommodate new phenomena such as soft sweeps coincides with the computational challenge of incorporating genomic data sets into selection studies. These challenges are addressed by the project, whose results will be available at http://proteomics.ucsd.edu/vbafna/research-2/nsf1458059/.
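
As a rough illustration of the two data summaries named above, site-frequency spectra and haplotype data, the following minimal Python sketch computes an unfolded site-frequency spectrum and a simple haplotype homozygosity from a small 0/1 haplotype matrix. It is not the project's method: the function names, the encoding (rows are haplotypes, columns are sites, 1 marks the derived allele), and the toy sample are all assumptions made for illustration.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry k counts the sites whose derived allele (1)
    appears in exactly k of the sampled haplotypes."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for j in range(n_sites):
        derived_count = sum(h[j] for h in haplotypes)
        sfs[derived_count] += 1
    return sfs


def haplotype_homozygosity(haplotypes):
    """Sum of squared haplotype frequencies: the chance that two
    haplotypes drawn with replacement from the sample are identical."""
    n = len(haplotypes)
    counts = Counter(tuple(h) for h in haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


# Hypothetical toy sample: 4 haplotypes, 4 segregating sites.
sample = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]

sfs = site_frequency_spectrum(sample)
h = haplotype_homozygosity(sample)
print("SFS:", sfs)
print("Haplotype homozygosity:", h)
```

In this toy sample one haplotype appears twice, which raises the homozygosity relative to a sample of four distinct haplotypes. Real selection scans work with far larger samples and must account for demography, as objective (3) notes.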

Agency
National Science Foundation (NSF)
Institute
Division of Biological Infrastructure (DBI)
Type
Standard Grant (Standard)
Application #
1458059
Program Officer
Peter McCartney
Project Start
Project End
Budget Start
2015-08-01
Budget End
2019-07-31
Support Year
Fiscal Year
2014
Total Cost
$574,954
Indirect Cost
Name
Stanford University
Department
Type
DUNS #
City
Stanford
State
CA
Country
United States
Zip Code
94305