This research program pertains to the development of tools and resources for the plant biology community. Computational studies of the genome sequence of the reference plant Arabidopsis thaliana have identified approximately 30,000 genes. To carry out a variety of functional genomics and proteomics applications, it is essential to identify all of the genes and determine their transcription unit structures. This project will utilize a newly developed single-chip whole-genome tiling array to experimentally map the transcription units in the Arabidopsis genome. The transcription unit mapping information will be used to amplify and clone, in a recombination-based vector, 6,000 full-length (FL) cDNA and open-reading-frame (ORF) clones. The DNA sequence of each clone will be determined to high accuracy, and this information can be used to improve the genome annotation. The construction of an error-free ORF clone for each protein-coding gene will enable a variety of functional genomics and proteomics studies. All cDNA/ORF clones will be deposited with the Arabidopsis Biological Resource Center, and the clone sequences will be deposited in GenBank. DNA sequence and tiling array hybridization data will also be displayed on the SIGnAL project web site: http://signal.salk.edu.
Broader Impacts: The beneficiaries of this research program include the entire plant biology community. The transcription unit DNA sequence information and cDNA/ORF clones produced by this project will provide investigators with essential information necessary to elucidate the functions of the Arabidopsis proteome. The collection of Arabidopsis ORF clones will enable the construction of whole-genome protein arrays, the development of protein-protein interaction maps, and the rapid creation of plants that ectopically express any ORF using any promoter of choice. The long-term impact of these enabling tools and technologies on agriculture is expected to be profound, providing fundamental knowledge for the construction of plants with superior agronomic traits. Importantly, all of the ORF clones, array data, and DNA sequences will be made freely available to the research community. Finally, an important feature of the program is the training of high school and undergraduate students in bioinformatics and functional genomic methodologies.