Protein identification is a major research direction in proteomics, holding the promise of signaling and treating diseases. Several techniques have been developed for protein identification, among which tandem mass spectrometry (MS/MS) is currently the most popular. Identifying proteins by tandem mass spectrometry is analogous to identifying a person by his or her fingerprints. Because an MS/MS spectrum is usually incomplete and contains noise peaks due to contaminants, poor peptide fragmentation, and other technical or biological reasons, protein identification is a challenging problem. The problem becomes more difficult when the spectrum contains post-translational modifications (PTMs), whose presence significantly increases the difficulty of both de novo sequencing and database search.

This research addresses the challenging problem of protein identification with theoretical studies and computational approaches. The work builds a complete system that identifies proteins by determining their amino acid sequences from their MS/MS spectra. In particular, for a given experimental tandem mass spectrum, the system identifies its amino acid sequence and PTMs through a series of activities: (a) separating different ion types and noise, (b) generating sequence tags, (c) identifying candidate peptide sequences, and (d) verifying the candidate peptide sequences and identifying PTMs.
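As an illustration of step (b) only, the following is a minimal sketch of sequence tag generation, not the project's actual method. It assumes the peaks have already been reduced to a single singly charged ion series (the outcome of step (a)), uses standard monoisotopic residue masses, and treats leucine and isoleucine as one residue because they have the same mass. The names `RESIDUE_MASS`, `residues_for_gap`, and `sequence_tags`, and the 0.02 Da tolerance, are hypothetical choices made for this example.

```python
# Monoisotopic residue masses in Da; "L" stands for both Leu and Ile.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
MAX_RESIDUE_MASS = max(RESIDUE_MASS.values())


def residues_for_gap(gap, tol=0.02):
    """Return the residues whose mass matches a peak-to-peak gap within tol."""
    return [aa for aa, mass in RESIDUE_MASS.items() if abs(gap - mass) <= tol]


def sequence_tags(peaks, length=3, tol=0.02):
    """Enumerate tags of the given length from a list of peak m/z values.

    Peaks are nodes; an edge labeled with a residue joins two peaks whose
    mass difference matches that residue. Each tag is a path of `length` edges.
    """
    peaks = sorted(peaks)
    edges = {i: [] for i in range(len(peaks))}
    for i, low in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            gap = peaks[j] - low
            if gap > MAX_RESIDUE_MASS + tol:
                break  # peaks are sorted, so later gaps are larger still
            for aa in residues_for_gap(gap, tol):
                edges[i].append((j, aa))

    tags = set()

    def extend(node, tag):
        if len(tag) == length:
            tags.add(tag)
            return
        for nxt, aa in edges[node]:
            extend(nxt, tag + aa)

    for start in range(len(peaks)):
        extend(start, "")
    return sorted(tags)


if __name__ == "__main__":
    # A synthetic ladder for the residues G, A, S plus one noise peak at 200.0.
    ladder = [100.0]
    for aa in "GAS":
        ladder.append(ladder[-1] + RESIDUE_MASS[aa])
    print(sequence_tags(ladder + [200.0], length=3))
```

In this sketch the noise peak contributes no tag because none of its gaps matches a residue mass. Shorter tags are more ambiguous: the combined mass of G and A is nearly identical to that of Q, so a tag length of 2 on the same ladder also yields a tag containing Q. Ambiguities of this kind are one reason candidate sequences need a separate verification step such as (d).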

Project Start:
Project End:
Budget Start: 2009-08-01
Budget End: 2013-07-31
Support Year:
Fiscal Year: 2008
Total Cost: $400,000
Indirect Cost:
Name: Howard University
Department:
Type:
DUNS #:
City: Washington
State: DC
Country: United States
Zip Code: 20059