One of the significant limitations of current proteomics software is its reliance on databases of previously identified genes or proteins to identify the protein(s) present in a sample analyzed by mass spectrometry (MS). This largely restricts proteomics research to organisms for which annotation is complete and accurate. With a large number of new draft sequences coming online (e.g., Tetrahymena and honeybee), and with annotation of the human genome still incomplete, this reliance presents a significant bottleneck. We developed the Genome Fingerprint Scanning (GFS) program to address this limitation. It is unique in that it can match MS data (and MS/MS data) directly to raw, unannotated sequence to identify proteins and locate novel genes. It has been used successfully to identify previously uncharacterized proteins and genes within Tetrahymena, Francisella tularensis, and the poxviruses. There is growing interest in using it for proteomic analysis of a diverse range of organisms, from adenoviruses to Arabidopsis thaliana. This proposal focuses on further developing GFS to transform it into a robust, freely available, and more widely used tool for proteomics research. Our aims are to complete a comprehensive web site for public use of GFS, supported by our local computing resources; to enhance the GFS algorithms for improved speed, reliability, and the ability to match protein data to multi-exon genes; and to port the code for use on Windows and popular Unix platforms, providing thorough administrator, developer, and end-user documentation and support.