Recombination and selection are two major evolutionary mechanisms that influence the pattern of variation in genomes. Efforts to deduce patterns of historical recombination are central to the design and analysis of disease association studies, and the ability to identify targets of selection may have important implications for biomedical research. The long-term objective of this application is to characterize quantitatively the effects of recombination and selection on genomic variation. Efficient algorithms and rigorous mathematical techniques will be developed for accurate inference in population genomics.
The specific aims of this application are:
Aim 1: Develop methods to assess Monte Carlo approaches to likelihood computations in the coalescent with recombination. Deterministic, algorithm-based methods will be developed to compute likelihoods very accurately, making it possible to test and fine-tune Monte Carlo approaches to likelihood computation. For input data of moderate size, the newly developed tool will be used to evaluate existing Monte Carlo methods.
Aim 2: Develop methods to characterize historical crossover and gene-conversion recombinations. A general mathematical framework based on diffusion approximation will be developed to obtain accurate multi-locus conditional sampling distributions. Using that approach, a method that can jointly estimate crossover and gene-conversion rates will be developed. Further, existing estimation methods will be revisited and specific computational improvements will be made.
Aim 3: Study the interaction of natural selection at multiple loci. The interaction of selection at multiple loci will be studied analytically, and the structure of linkage disequilibrium (LD) shaped by interacting selection will be characterized. Fixation probabilities under multi-locus selection will also be studied.
Relevance: Understanding the pattern of variation in the human genome is central to the study of the genetic basis of disease risk and variability in drug response. The aim of this research is to develop accurate methods to characterize various evolutionary mechanisms that shape the pattern of genomic variation.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Transition Award (R00)
Project #
5R00GM080099-04
Application #
7750030
Study Section
Special Emphasis Panel (ZGM1-BRT-9 (KR))
Program Officer
Hagan, Ann A
Project Start
2006-12-01
Project End
2011-12-31
Budget Start
2010-01-01
Budget End
2011-12-31
Support Year
4
Fiscal Year
2010
Total Cost
$246,494
Indirect Cost
Name
University of California Berkeley
Department
Engineering (All Types)
Type
Schools of Engineering
DUNS #
124726725
City
Berkeley
State
CA
Country
United States
Zip Code
94704
Langley, Charles H; Stevens, Kristian; Cardeno, Charis et al. (2012) Genomic variation in natural populations of Drosophila melanogaster. Genetics 192:533-98
Song, Yun S; Steinrucken, Matthias (2012) A simple method for finding explicit analytic transition densities of diffusion processes with general diploid selection. Genetics 190:1117-29
Jenkins, Paul A; Griffiths, Robert C (2011) Inference from samples of DNA sequences using a two-locus model. J Comput Biol 18:109-27
Birkner, Matthias; Blath, Jochen; Steinrucken, Matthias (2011) Importance sampling for Lambda-coalescents in the infinitely many sites model. Theor Popul Biol 79:155-73
Song, Yun S; Wang, Fulton; Slatkin, Montgomery (2010) General epistatic models of the risk of complex diseases. Genetics 186:1467-73
Paul, Joshua S; Song, Yun S (2010) A principled approach to deriving approximate conditional sampling distributions in population genetics models with recombination. Genetics 186:321-38
Jenkins, Paul A; Song, Yun S (2010) An asymptotic sampling formula for the coalescent with recombination. Ann Appl Probab 20:1005-1028
Yin, Junming; Jordan, Michael I; Song, Yun S (2009) Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data. Bioinformatics 25:i231-9
Bhaskar, Anand; Song, Yun S (2009) Multi-locus match probability in a finite population: a fundamental difference between the Moran and Wright-Fisher models. Bioinformatics 25:i187-95
Song, Yun S; Patil, Anand; Murphy, Erin E et al. (2009) Average probability that a "cold hit" in a DNA database search results in an erroneous attribution. J Forensic Sci 54:22-7

Showing the most recent 10 out of 15 publications