This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

A suite of data reduction algorithms, called MasSPIKE, has been developed for isotopically resolved mass spectra. It includes methods for noise modelling, isotopic cluster identification, charge state determination, isotopic peak picking, and alignment of experimental and theoretical isotopic distributions. The final output is an accurate and complete list of the monoisotopic masses in the spectrum, which can be used to search protein databases with high mass accuracy. If the sequence of the biomolecule under consideration is known, the observed masses are matched against the possible fragment ions and common losses and adducts.

Previously developed algorithms for isotopic cluster identification were improved to identify overlapping clusters of very low and very high charge states. For a given molecular mass, the elemental composition was estimated using the averagine model for proteins. Rockwood's Mercury6 algorithm was incorporated into the program to generate the theoretical isotopic distribution for a given elemental composition. The previously reported charge state determination routine (Matched Filter approach) was modified to normalize the experimental and theoretical isotopic distributions more effectively before the charge state is determined. Special focus has been given to resolving overlapping isotopic distributions with different charge states. All these methods have been integrated into MasSPIKE.
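The averagine step described above can be illustrated with a short sketch. This is not MasSPIKE's code: the function names are hypothetical, the averagine formula (C4.9384 H7.7583 N1.3577 O1.4773 S0.0417, average mass 111.1254 Da) is the commonly cited one, and topping up the rounding residual with hydrogens is an assumed convention.

```python
# Averagine: the "average" amino acid residue commonly used to estimate
# protein elemental compositions from mass alone.
AVERAGINE = {"C": 4.9384, "H": 7.7583, "N": 1.3577, "O": 1.4773, "S": 0.0417}
AVERAGINE_MASS = 111.1254  # average mass of one averagine unit, in Da

# Average atomic masses in Da (rounded standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def averagine_composition(mass):
    """Estimate an integer elemental composition for a given average mass.

    Hypothetical helper: scales the averagine formula by the number of
    averagine units in `mass`, rounds each element to an integer, then
    (as an assumed convention) resets the hydrogen count so the total
    mass stays as close as possible to the requested mass.
    """
    units = mass / AVERAGINE_MASS
    comp = {el: round(n * units) for el, n in AVERAGINE.items()}
    heavy = sum(ATOMIC_MASS[el] * n for el, n in comp.items() if el != "H")
    comp["H"] = max(0, round((mass - heavy) / ATOMIC_MASS["H"]))
    return comp


def average_mass(comp):
    """Average mass in Da of an elemental composition dictionary."""
    return sum(ATOMIC_MASS[el] * n for el, n in comp.items())


if __name__ == "__main__":
    # Roughly the mass of myoglobin, used purely as an illustration.
    comp = averagine_composition(16951.0)
    print(comp, round(average_mass(comp), 2))
```

A composition estimated this way would then be passed to an isotopic distribution generator (Mercury6 in MasSPIKE) to obtain the theoretical pattern for comparison with the experiment.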
MasSPIKE successfully resolved 9 overlapping isotopic distributions, several of which shared isotopic peaks, in the spectrum of a biologically derived sample (Ubch10 protein from an immunoprecipitated whole cell lysate). The complete analysis of a top-down spectrum of bovine carbonic anhydrase yielded 165 isotopic distributions, which were assigned to the closest theoretical fragment masses. A typical case is shown in Fig. 1. The Matched Filter approach determined the charge state correctly in 91% of cases across 775 isotopic distributions of myoglobin, with charge states ranging from 8 to 22 and S/N ranging from 2 to 100. An isotope alignment module has been developed to align the experimental and theoretical distributions and was tested using 3150 Monte Carlo simulations, each generated from 100 ions. These tests showed that the Maximum Likelihood alignment method aligned the distributions correctly 85% of the time, compared with 76% for the least squares error method. MasSPIKE was applied to intact-protein and top-down ESI spectra of a number of samples from biological sources. Analysis of the hemoglobin alpha chain from human blood samples distinguished a sample carrying a mutated alpha variant from a normal sample. Analysis of the hemoglobin beta chain revealed differences between the sickle beta chain and the normal beta chain. Monoisotopic masses of large intact proteins (20-23 kDa) were determined from a single isotopic cluster with a mass error of less than 5 ppm.