This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

Shotgun proteomics uses liquid chromatography-tandem mass spectrometry to identify the complement of protein molecules in a complex biological sample. This technology is extremely powerful; however, current scoring algorithms fail to discriminate accurately between correct and incorrect peptide identifications. We describe a computational method for improving the rate of peptide identifications from a given collection of mass spectra. The method, called Percolator, post-processes candidate identifications generated by a database search algorithm such as SEQUEST. Percolator uses a semi-supervised machine learning technique to discriminate dynamically between correct and decoy spectrum identifications. Relative to a fully supervised approach, our method correctly assigns peptides to 17% more spectra from a tryptic microLC-MS/MS dataset and up to 75% more spectra from non-tryptic digests.
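
The semi-supervised idea described above can be illustrated with a minimal sketch. This is not the Percolator implementation: it assumes each candidate identification is a list of numeric features with a target/decoy flag, swaps in a plain logistic regression as the classifier, and omits cross-validation. The names `q_values`, `train_logistic`, and `percolate` are hypothetical and chosen only for illustration.

```python
import math

def q_values(scores, is_decoy):
    """Target-decoy q-values: the lowest decoys/targets ratio over all
    score thresholds that still accept the given identification."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fdrs, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    q = [0.0] * len(scores)
    running = float("inf")
    # Walk from the worst score upward so q-values are monotone in rank.
    for rank in range(len(order) - 1, -1, -1):
        running = min(running, fdrs[rank])
        q[order[rank]] = running
    return q

def train_logistic(X, y, epochs=200, lr=0.1):
    """Stochastic gradient descent for logistic regression
    (an assumed stand-in for the classifier)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            g = 1.0 / (1.0 + math.exp(-z)) - t
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def percolate(features, is_decoy, init_col=0, fdr=0.01, iterations=3):
    """Iteratively re-score identifications.

    Positives: targets accepted at the chosen FDR under the current score.
    Negatives: all decoys (known to be incorrect by construction).
    """
    scores = [float(f[init_col]) for f in features]
    for _ in range(iterations):
        q = q_values(scores, is_decoy)
        pos = [f for f, d, qi in zip(features, is_decoy, q) if not d and qi <= fdr]
        neg = [f for f, d in zip(features, is_decoy) if d]
        if not pos or not neg:
            break  # nothing to learn from; keep the current scores
        w, b = train_logistic(pos + neg, [1] * len(pos) + [0] * len(neg))
        scores = [sum(wi * xi for wi, xi in zip(w, f)) + b for f in features]
    return scores
```

Because the positive set is re-selected from the data on every pass, the classifier adapts to each dataset rather than relying on a fixed, pre-trained model, which is the sense in which the discrimination is "dynamic" in this sketch.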