The genomes of individuals alive today are derived from some common ancestor(s) by the actions of mutation and meiotic recombination. Recombination mixes two homologous chromosomes in an individual to produce a third, recombinant (mosaic) chromosome consisting of alternating segments of the two homologs. A recombinant chromosome is then passed on to a child of the individual. Therefore, the derivation of genomes in a current population from some ancestral genome(s) cannot be represented by a tree, but must instead be represented by a directed acyclic graph, called a phylogenetic network, genealogical network, or ancestral recombination graph (ARG). Explicitly knowing the historically correct genealogical network that derived extant genomes, or knowing critical features of the network, would greatly facilitate the solution of fundamental problems in biology, and has important practical applications, for example in association mapping in populations, a technique for finding genes that affect diseases and economically important traits. However, since we cannot directly examine the past, we must computationally deduce a genealogical network, or features of it, from genomes that we can examine in populations today. The development of algorithms for such computation requires significant interaction of ideas from biology, computer science, graph theory, mathematics, and algorithm and software engineering. This interdisciplinary project, conducted by computer scientists in collaboration with a population geneticist, is focused on developing efficient algorithms to infer and exploit complex genealogical histories under a variety of biological models of the evolution of genomic sequences and of genetic traits, using different types of existing and emerging biological data. These algorithms will be implemented in software that can be used to study fundamental biological questions, and applied to practical problems such as association mapping of complex traits.
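As an illustration of why recombination forces a network rather than a tree, the following is a minimal Python sketch, not the project's software. It assumes chromosomes are encoded as binary strings (0 = ancestral allele, 1 = derived allele) and models only a single crossover per recombination; the names `Node`, `mutate`, and `recombine` are hypothetical and introduced here only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One chromosome in a toy genealogical network (assumed binary encoding)."""
    sequence: str
    parents: list = field(default_factory=list)  # 0 parents: root, 1: mutation, 2: recombination


def mutate(parent: Node, site: int, allele: str) -> Node:
    """Return a child that differs from its single parent at one site."""
    seq = parent.sequence[:site] + allele + parent.sequence[site + 1:]
    return Node(seq, [parent])


def recombine(left: Node, right: Node, breakpoint: int) -> Node:
    """Single crossover: sites before the breakpoint come from `left`, the rest from `right`."""
    seq = left.sequence[:breakpoint] + right.sequence[breakpoint:]
    return Node(seq, [left, right])


# Toy history: one ancestral sequence, two mutations, one recombination.
root = Node("00000")
a = mutate(root, 0, "1")
b = mutate(root, 4, "1")
r = recombine(a, b, 2)

print(r.sequence, len(r.parents))
```

The recombinant node `r` has two parents, so the history is a directed acyclic graph rather than a tree. A chromosome with several alternating segments would correspond to more than one crossover, which this sketch does not model.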
The central thesis of the project is that explicit genealogical networks can be efficiently computed, and that these networks capture enough of the true history (even if they do not capture all of it) to allow researchers to answer fundamental biological questions and solve practical biological problems more effectively. The project also addresses fundamentally new algorithmic problems and biological applications, driven by new kinds of population variation data that are becoming available, new areas of biology where population data are becoming available, new biological models that have recently been proposed for the evolution of sequences and genetic traits in populations, new understanding of the different genomic variations that affect important traits, and biological controversies and questions about the nature (and even the existence) of recombination, and about its role in other biological phenomena. This work will contribute to algorithmic computer science and also to several areas of biology, particularly population genetics. The algorithms and software to be developed will allow biologists to deduce complex genealogical histories, to better understand the role of recombination, and to address both fundamental biological problems and applied practical problems. The software will be disseminated on the web, along with slides and videos of lectures on the algorithms underlying the software. The project will allow the joint mentoring of graduate students and post-doctoral researchers by advisers from both computer science and biology. The participation of researchers from both computer science and biology makes the research more visible to their respective communities, encouraging other interdisciplinary research.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0803564
Program Officer: Sylvia J. Spengler
Budget Start: 2008-09-01
Budget End: 2013-08-31
Fiscal Year: 2008
Total Cost: $602,791
Name: University of California Davis
City: Davis
State: CA
Country: United States
Zip Code: 95618