This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

Examining LC-MS/MS raw data of complex peptide mixtures (e.g., those found in major histocompatibility complexes or in protein digests encountered in proteomics) shows that only a fraction of the spectra are of sufficient intensity to give good-quality product ion spectra. Moreover, many of the 'good quality' spectra arise from non-peptidic contaminants and therefore yield no sequence information. If all the extraneous spectra could be identified and eliminated before the search, a much faster and more efficient search could be performed. Analyzing MS/MS statistical data with multivariate methods that include chemically relevant classifiers can improve the discrimination between useful and extraneous spectra. A new computer program was written in Perl. The program extracts statistical information from each raw spectrum. The collected data are tabulated and passed to a statistical package, which fits a statistical model using the Partial Least Squares method. The correlation between the predicted score and the actual one is sufficient to eliminate a large percentage of the bad spectra. Because only spectra with a peptidic signature are retained, spectra that remain unidentified after the search are good candidates for de novo sequencing or error-tolerant database searching.
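The workflow described above (per-spectrum statistics, a Partial Least Squares fit, then filtering on the predicted score) can be illustrated with a minimal sketch. It is not the authors' program, which was written in Perl and relied on a separate statistical package. The three features, the function names, and the single-component PLS1 fit without feature scaling are all assumptions chosen for brevity; the abstract does not list the actual statistics or classifiers used.

```python
import math
from statistics import mean


def spectrum_features(peaks):
    """Summarize one MS/MS spectrum given as a list of (m/z, intensity) pairs.

    The three features are illustrative assumptions, not the authors' set.
    """
    intensities = sorted((i for _, i in peaks), reverse=True)
    tic = sum(intensities)  # total ion current
    if tic == 0:
        return [0.0, 0.0, 0.0]
    return [
        float(len(intensities)),      # number of peaks
        math.log10(tic),              # log10 of total ion current
        sum(intensities[:5]) / tic,   # share of signal in the 5 strongest peaks
    ]


def fit_pls1(X, y):
    """Fit a one-component PLS1 model (hypothetical helper, no feature scaling)."""
    n, p = len(X), len(X[0])
    x_mean = [mean(row[j] for row in X) for j in range(p)]
    y_mean = mean(y)
    # Center the predictors and the response.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("features carry no covariance with the scores")
    w = [v / norm for v in w]
    # Latent scores and the regression of y on them.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q


def predict_score(model, x):
    """Predict the quality score of one feature vector."""
    x_mean, y_mean, w, q = model
    t = sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))
    return y_mean + q * t


def keep_spectra(model, spectra, threshold):
    """Retain spectra whose predicted score reaches a user-chosen threshold."""
    return [s for s in spectra
            if predict_score(model, spectrum_features(s)) >= threshold]
```

In use, `fit_pls1` would be trained on feature vectors from spectra with known search scores, and `keep_spectra` would then screen new spectra before the database search. The threshold is a hypothetical tuning parameter that trades discarded good spectra against retained bad ones.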