Peptide identification and protein identification are problems of fundamental importance in proteomics. We propose to study a new multi-pronged framework for de novo peptide sequencing and for protein identification under uncertainty, using tandem mass spectrometry. The proposed approach will be based on fundamental advances in mathematical modeling via mixed-integer optimization, as well as in theory and algorithms for optimization under uncertainty. We expect the work to yield significant advances in both theory and algorithms. We put forward the following four specific aims:
Specific Aim 1: Investigate and develop a novel de novo computational approach for peptide identification, based on information from the ion peaks in the peptide spectrum and on a mixed-integer optimization modeling and algorithmic framework.
Specific Aim 2: Investigate novel de novo methods for the identification of peptides in complex protein mixtures that account for experimental uncertainty in the calculation of the mass/charge ratios of the ion peaks.
Specific Aim 3: Study and develop a new hybrid in silico method that combines the de novo approach of Specific Aim 1 with database methods for peptide identification.
Specific Aim 4: Investigate and develop a new approach for protein identification that combines the advances in Specific Aims 1-3 with database homology-based methods.
Preliminary studies for Specific Aims 1, 2, and 3 are reported (sections C.1, C.2, D.1, D.2, D.3), and their results, in terms of comparative studies and computational efficiency, demonstrate the potential of the proposed research for high-throughput peptide and protein identification.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM009338-04
Application #
7835817
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2007-05-01
Project End
2013-04-30
Budget Start
2010-05-01
Budget End
2013-04-30
Support Year
4
Fiscal Year
2010
Total Cost
$257,169
Indirect Cost
Name
Princeton University
Department
Engineering (All Types)
Type
Schools of Engineering
DUNS #
002484665
City
Princeton
State
NJ
Country
United States
Zip Code
08544
Guzman, Y A; Sakellari, D; Papadimitriou, K et al. (2018) High-throughput proteomic analysis of candidate biomarker changes in gingival crevicular fluid after treatment of chronic periodontitis. J Periodontal Res 53:853-860
Guzman, Yannis A; Sakellari, Dimitra; Arsenakis, Minas et al. (2014) Proteomics for the discovery of biomarkers and diagnosis of periodontitis: a critical review. Expert Rev Proteomics 11:31-41
Baliban, Richard C; Sakellari, Dimitra; Li, Zukui et al. (2013) Discovery of biomarker combinations that predict periodontal health or disease with high accuracy from GCF samples based on high-throughput proteomic analysis and mixed-integer linear optimization. J Clin Periodontol 40:131-9
Baliban, Richard C; Dimaggio, Peter A; Plazas-Mayorca, Mariana D et al. (2012) PILOT_PROTEIN: identification of unmodified and modified proteins via high-resolution mass spectrometry and mixed-integer linear optimization. J Proteome Res 11:4615-29
Li, Zukui; Floudas, Christodoulos A (2012) A Comparative Theoretical and Computational Study on Robust Counterpart Optimization: II. Probabilistic Guarantees on Constraint Satisfaction. Ind Eng Chem Res 51:6769-6788
Baliban, Richard C; Sakellari, Dimitra; Li, Zukui et al. (2012) Novel protein identification methods for biomarker discovery via a proteomic analysis of periodontally healthy and diseased gingival crevicular fluid samples. J Clin Periodontol 39:203-12
Li, Zukui; Ding, Ran; Floudas, Christodoulos A (2011) A Comparative Theoretical and Computational Study on Robust Counterpart Optimization: I. Robust Linear Optimization and Robust Mixed Integer Linear Optimization. Ind Eng Chem Res 50:10567-10603
Greco, G; Rosa, R; Beskin, G et al. (2011) Evidence of deterministic components in the apparent randomness of GRBs: clues of a chaotic dynamic. Sci Rep 1:91
Khoury, George A; Baliban, Richard C; Floudas, Christodoulos A (2011) Proteome-wide post-translational modification statistics: frequency analysis and curation of the swiss-prot database. Sci Rep 1:
DiMaggio Jr, Peter A; Subramani, Ashwin; Judson, Richard S et al. (2010) A novel framework for predicting in vivo toxicities from in vitro data using optimal methods for dense and sparse matrix reordering and logistic regression. Toxicol Sci 118:251-65
