This project will generate a community resource for Arabidopsis genetics by integrating several new data sets with the existing A. thaliana genome sequence and those of two closely related species, A. lyrata and Capsella rubella. These data will make it possible to answer fundamental questions about the evolution of A. thaliana, and will facilitate the interpretation of the existing genome sequence. The project is analogous to efforts to understand human variation and evolution by integrating vast amounts of human polymorphism data with genome sequences from closely related primates such as chimpanzee and orangutan. It will provide a tool for making sense of genetic variation. Understanding how genotypic variation translates into phenotypic variation, and how it is structured in populations, is fundamental to our understanding of evolution and has enormous practical implications for human health as well as for plant and animal breeding. The project will also provide training opportunities in bioinformatics and computational biology for multiple students at different levels, areas where there is a nationally recognized shortage of biology students with the requisite quantitative skills.