A significant limitation of current proteomics software is its reliance on databases of previously identified genes or proteins to identify the protein(s) present in a sample analyzed by mass spectrometry (MS). This largely restricts proteomics research to organisms whose annotation is complete and accurate. With a large number of new draft sequences coming online (e.g., Tetrahymena and honeybee), and with annotation of the human genome still incomplete, this reliance is a significant bottleneck. We developed the Genome Fingerprint Scanning (GFS) program to address this limitation. It is unique in that it can match MS data (and MS/MS data) directly to raw, unannotated sequence to identify proteins and locate novel genes. It has been used successfully to identify previously uncharacterized proteins and genes within Tetrahymena, Francisella tularensis, and the poxviruses. There is growing interest in using it for proteomic analysis of a diverse range of organisms, from adenoviruses to Arabidopsis thaliana. This proposal focuses on further developing GFS into a robust, freely available, and more widely used tool for proteomics research. Our aims are to complete a comprehensive web site for public use of GFS, supported by our local computing resources; to enhance the GFS algorithms for improved speed, reliability, and the ability to match protein data to multi-exon genes; and to port the code to Windows and popular Unix platforms, providing thorough administrator, developer, and end-user documentation and support.
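The general idea of matching peptide masses directly to raw sequence can be illustrated with a minimal sketch. This is not the GFS algorithm or its scoring method, only a simplified stand-in under stated assumptions: the genome is translated in all six reading frames, each stop-to-stop segment is digested in silico with trypsin-like rules (cleave after K or R, not before P), and monoisotopic peptide masses are compared with observed masses within a fixed tolerance. All function names (`scan_genome`, `tryptic_peptides`, and so on), the tolerance, and the minimum peptide length are hypothetical choices made for illustration. The sketch ignores multi-exon genes, missed cleavages, modifications, and statistical scoring.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}

# Monoisotopic residue masses in daltons; one water is added per peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODONS.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def scan_genome(dna, observed_masses, tol=0.01, min_len=4):
    """Return (strand, frame, peptide, mass) for every theoretical peptide
    whose mass lies within `tol` daltons of an observed mass."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            # Stop codons split each frame into stop-to-stop segments.
            for segment in translate(seq[frame:]).split("*"):
                for pep in tryptic_peptides(segment):
                    if len(pep) < min_len or "X" in pep:
                        continue
                    mass = peptide_mass(pep)
                    if any(abs(mass - obs) <= tol for obs in observed_masses):
                        hits.append((strand, frame, pep, round(mass, 4)))
    return hits


# Toy example: a made-up sequence encoding M-A-G-K-S-L-E-R followed by a stop.
toy_dna = "ATGGCTGGTAAATCTCTGGAACGTTAA"
print(scan_genome(toy_dna, [peptide_mass("SLER")]))
```

In practice, a single mass match is weak evidence. A usable tool would need to combine many matching peptides within a genomic window and score them against chance, which this sketch does not attempt.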

Agency
National Institutes of Health (NIH)
Institute
National Center for Research Resources (NCRR)
Type
Research Project (R01)
Project #
5R01RR020823-03
Application #
7119605
Study Section
Special Emphasis Panel (ZRG1-BST-D (51))
Program Officer
Sheeley, Douglas
Project Start
2004-09-24
Project End
2007-12-14
Budget Start
2006-09-01
Budget End
2007-12-14
Support Year
3
Fiscal Year
2006
Total Cost
$354,855
Indirect Cost
Name
University of North Carolina Chapel Hill
Department
Microbiology/Immun/Virology
Type
Schools of Medicine
DUNS #
608195277
City
Chapel Hill
State
NC
Country
United States
Zip Code
27599
Stiegelmeyer, Suzy M; Giddings, Morgan C (2013) Agent-based modeling of competence phenotype switching in Bacillus subtilis. Theor Biol Med Model 10:23
Jefferys, Stuart R; Giddings, Morgan C (2011) Baking a mass-spectrometry data PIE with McMC and simulated annealing: predicting protein post-translational modifications from integrated top-down and bottom-up data. Bioinformatics 27:844-52
Miller, Jameson; Parker, Miles; Bourret, Robert B et al. (2010) An agent-based model of signal transduction in bacterial chemotaxis. PLoS One 5:e9454
Su, Hsun-Cheng; Ramkissoon, Kevin; Doolittle, Janet et al. (2010) The development of ciprofloxacin resistance in Pseudomonas aeruginosa involves multiple response stages and multiple proteins. Antimicrob Agents Chemother 54:4626-35
Khatun, Jainab; Hamlett, Eric; Giddings, Morgan C (2008) Incorporating sequence information into the scoring function: a hidden Markov model for improved peptide identification. Bioinformatics 24:674-81
Yang, Dongmei; Ramkissoon, Kevin; Hamlett, Eric et al. (2008) High-accuracy peptide mass fingerprinting using peak intensity data with machine learning. J Proteome Res 7:62-9
Khatun, Jainab; Ramkissoon, Kevin; Giddings, Morgan C (2007) Fragmentation characteristics of collision-induced dissociation in MALDI TOF/TOF mass spectrometry. Anal Chem 79:3032-40
Su, Hsun-Cheng; Hutchison 3rd, Clyde A; Giddings, Morgan C (2007) Mapping phosphoproteins in Mycoplasma genitalium and Mycoplasma pneumoniae. BMC Microbiol 7:63
Crayton 3rd, Mack E; Powell, Bradford C; Vision, Todd J et al. (2006) Tracking the evolution of alternatively spliced exons within the Dscam family. BMC Evol Biol 6:16
Wisz, Michael S; Suarez, Melissa Kimball; Holmes, Mark R et al. (2004) GFSWeb: a web tool for genome-based identification of proteins from mass spectrometric samples. J Proteome Res 3:1292-5