Proteins are the workhorses of cells, comprising much of the machinery of life. Chemical changes due to co- or post-translational modifications, or amino acid substitutions resulting from genetic variation, can alter protein function and have significant consequences for the functioning of a cell. Pinpointing chemical changes in proteins in an automated manner remains an elusive goal. Mass spectrometry (MS)-based methodologies are promising for examining such alterations, since they are exquisitely sensitive to the resulting shifts in mass. There are two main approaches to examining proteins by MS: one measures the intact masses of proteins to detect shifts indicative of modifications (called top-down), and the other enzymatically digests proteins into short peptides, then analyzes their chemical structure by tandem mass spectrometry (called bottom-up). Each of the existing MS methods has limitations, such as the lack of complete protein coverage for bottom-up and the inability to use top-down data to uniquely identify modifications; these drawbacks have motivated the development of hybrid combinations such as "top-down bottom-up" (TDBU) proteomics. Though these hybrid approaches are seeing a surge of interest, there is an acute lack of comprehensive, automated software for combining measurements from the distinct MS approaches; thus, studies to date have relied upon extensive manual analysis and/or ad hoc program scripts, inhibiting progress in the field. We propose to address this issue by building on our two existing programs, PROCLAME for analyzing top-down data and GFS for analyzing bottom-up data, to develop integrated, open-source software that combines data from multiple MS methodologies to pinpoint post-translational modifications and amino acid substitutions in proteins.
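The ambiguity of top-down data noted above can be illustrated with a minimal sketch. It is not taken from PROCLAME or GFS; the function name `explain_mass_shift`, the four-entry modification table, the example masses, and the 0.05 Da tolerance are all assumptions chosen for illustration. The mass values are standard monoisotopic shifts for the named modifications.

```python
from itertools import combinations_with_replacement

# Monoisotopic mass shifts in daltons for a few common modifications.
# This short table is illustrative only; real searches use far larger lists.
MOD_MASSES = {
    "phospho": 79.96633,
    "acetyl": 42.01056,
    "methyl": 14.01565,
    "oxidation": 15.99491,
}


def explain_mass_shift(observed_mass, theoretical_mass, max_mods=3, tol=0.05):
    """Return every combination of up to max_mods modifications whose summed
    mass matches the observed intact-mass shift within tol daltons.

    Hypothetical helper: brute-force enumeration, not the authors' method.
    """
    delta = observed_mass - theoretical_mass
    matches = []
    for n in range(1, max_mods + 1):
        # Order does not matter for intact mass, so use multisets of mods.
        for combo in combinations_with_replacement(sorted(MOD_MASSES), n):
            total = sum(MOD_MASSES[name] for name in combo)
            if abs(total - delta) <= tol:
                matches.append(combo)
    return matches


# Assumed example: a protein predicted at 10000.00 Da is observed at 10042.03 Da.
candidates = explain_mass_shift(observed_mass=10042.03, theoretical_mass=10000.00)
print(candidates)
```

With these assumed inputs, a single acetylation and a triple methylation both fall within tolerance, and neither candidate says which residue carries the change. That is the gap bottom-up fragment data would be expected to fill in a combined analysis.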
Our aims are: 1) to integrate multiple MS data sources for determining the type and location of modifications on proteins, by adding a Markov chain Monte Carlo (MCMC) based engine to PROCLAME; 2) to improve the ability to analyze bottom-up data by enhancing GFS for the automatic determination of post-translational modifications; 3) to manage and integrate results from multiple MS measurements and search engines, by developing a database system and scripts to tie the programs together; and 4) to assure program reliability and suitability through both alpha testing in-house and beta testing at external sites.
Health Relevance: Both amino acid substitutions and misregulation of enzymes that modify proteins play roles in human diseases such as cancer, diabetes, sickle cell anemia, and many others. This proposal is to build generalized software that can be used by a broad base of researchers to pinpoint the chemical changes and modifications to proteins that perturb regulatory networks in cells to cause disease, by integrating data from the latest proteomic technologies.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Research Project (R01)
Project #: 5R01RR020823-07
Application #: 7996056
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Sheeley, Douglas
Project Start: 2004-09-24
Project End: 2011-01-31
Budget Start: 2010-12-01
Budget End: 2011-01-31
Support Year: 7
Fiscal Year: 2011
Total Cost: $34,323
Indirect Cost:

Name: University of North Carolina Chapel Hill
Department: Microbiology/Immun/Virology
Type: Schools of Medicine
DUNS #: 608195277
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599
Stiegelmeyer, Suzy M; Giddings, Morgan C (2013) Agent-based modeling of competence phenotype switching in Bacillus subtilis. Theor Biol Med Model 10:23
Jefferys, Stuart R; Giddings, Morgan C (2011) Baking a mass-spectrometry data PIE with McMC and simulated annealing: predicting protein post-translational modifications from integrated top-down and bottom-up data. Bioinformatics 27:844-52
Miller, Jameson; Parker, Miles; Bourret, Robert B et al. (2010) An agent-based model of signal transduction in bacterial chemotaxis. PLoS One 5:e9454
Su, Hsun-Cheng; Ramkissoon, Kevin; Doolittle, Janet et al. (2010) The development of ciprofloxacin resistance in Pseudomonas aeruginosa involves multiple response stages and multiple proteins. Antimicrob Agents Chemother 54:4626-35
Khatun, Jainab; Hamlett, Eric; Giddings, Morgan C (2008) Incorporating sequence information into the scoring function: a hidden Markov model for improved peptide identification. Bioinformatics 24:674-81
Yang, Dongmei; Ramkissoon, Kevin; Hamlett, Eric et al. (2008) High-accuracy peptide mass fingerprinting using peak intensity data with machine learning. J Proteome Res 7:62-9
Khatun, Jainab; Ramkissoon, Kevin; Giddings, Morgan C (2007) Fragmentation characteristics of collision-induced dissociation in MALDI TOF/TOF mass spectrometry. Anal Chem 79:3032-40
Su, Hsun-Cheng; Hutchison 3rd, Clyde A; Giddings, Morgan C (2007) Mapping phosphoproteins in Mycoplasma genitalium and Mycoplasma pneumoniae. BMC Microbiol 7:63
Crayton 3rd, Mack E; Powell, Bradford C; Vision, Todd J et al. (2006) Tracking the evolution of alternatively spliced exons within the Dscam family. BMC Evol Biol 6:16
Wisz, Michael S; Suarez, Melissa Kimball; Holmes, Mark R et al. (2004) GFSWeb: a web tool for genome-based identification of proteins from mass spectrometric samples. J Proteome Res 3:1292-5