Intellectual Merit: This project aims to develop genome-wide experimental tools and technologies for analyzing gene regulation and function in Arabidopsis. Computational analysis of the initial genome sequence of Arabidopsis thaliana, completed in 2000, provided evidence for approximately 25,500 protein-coding genes. Intensive efforts during the past ten years to experimentally annotate this genome sequence have now identified 31,128 genes, which include additional protein-coding genes as well as non-coding RNA genes. The identification of T-DNA insertion mutations in all Arabidopsis genes has been an ongoing aim of worldwide functional genomics programs. Analysis of the available set of T-DNA mutants reveals that ~24,000 additional mutant alleles are needed to create a comprehensive homozygous mutant collection. This research specifically addresses the following goals: to identify two genetically stable loss-of-function mutations in all Arabidopsis genes and to complete the goal of isolating two homozygous alleles for every gene in the genome. Illumina paired-end deep sequencing will be used to identify mutations in these "missing" genes, thereby allowing completion of the "unimutant" collection for the annotated set of Arabidopsis genes. Further development of the T-DNA-Seq method for large-scale capture and sequencing of approximately one million T-DNA insertion sites will allow identification of the inventory of essential genes and provide insights into the mechanism of T-DNA integration and associated gene silencing events.
Broader Impacts: The genomic resources developed by this project will be widely available to a large number of researchers and will provide the basis for a variety of research projects that rely upon whole-genome information. Completion of the proposed research will provide an important new resource for the plant biology community, enabling a variety of genome-wide mutant screens for any visible phenotype of interest. An important feature of this research is that all of the mutant plants and populations will be available to the research community as soon as they are produced. The program will benefit the entire plant biology community by providing the essential reagents needed to elucidate the functions of Arabidopsis genes. Additionally, the new technology developed to rapidly and inexpensively index insertion mutants by next-generation sequencing in any plant or animal will have applications far beyond Arabidopsis research. The long-term impact of these enabling tools and technologies on agriculture is expected to be profound, providing fundamental knowledge for the construction of new plant varieties with superior agronomic traits. An equally important aspect of our research program is the hands-on training in plant genome research that will be provided at a variety of levels, including outreach to minority high school and undergraduate students.