This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
In addition to the more than 1200 hemoglobin variants that occur, post-translational modifications (PTMs) may also be implicated in the etiology of patient diseases. An MS-based technology platform that employs MALDI-TOF MS, VC-MALDI-FT MS, ESI-qQq-FT MS/MS and LC-MS/MS is being used to analyze human blood samples for the presence of variants and/or PTMs on hemoglobin. While most variants have been identified in agreement with their gene-based DNA sequence, PTMs and some variant types have been detected and located only by the current integrated methodology.
Whole blood was diluted and cleaned up to remove cellular debris and salts. Trypsin and AspN digestions of the intact globin chains were performed for peptide mass mapping and MS/MS. Intact hemoglobin chains were analyzed and top-down sequenced via ESI-qQq-FT MS/MS (on the quadrupole-FT hybrid constructed in-house), and the peptide mapping was performed on the same instrument. Additionally, digests were analyzed by MALDI-TOF MS, MALDI-FT MS/MS (with the vibrational-cooling MALDI-FT MS, also constructed in-house), and online LC-MS/MS. Data were processed and searched against SwissProt and custom-programmed hemoglobin/PTM databases using commercially available software and software written in-house. Minimal requirements for purification, derivatization or separation of the blood samples considerably simplified the sample preparation and reduced artifacts associated with sample purification, which may perturb PTMs. MALDI-TOF MS was carried out as a first pass for peptide mapping. 
More accurate mass mapping was achieved by VC-MALDI-FT MS. Nanospray ESI-qQq-FT MS was applied to the measurement of the intact hemoglobin chains. Calculation of the charge state and identification of the m/z of the monoisotopic peak were performed using software written in-house and yielded mass measurements accurate to within a few ppm. Mutations and PTMs were observed at the intact protein level. Localization of the mutation(s) and PTMs was achieved using a combination of top-down sequencing, peptide mapping and MS/MS peptide sequencing. For online LC-MS/MS of the hemoglobin digests, data analysis was fully automated. An iterative approach is used for peptide sequencing, with a pre-programmed hemoglobin database and a pre-programmed PTM database. More than 75 clinically interesting samples, including diverse hemoglobin variants, have been characterized using this MS-based proteomics approach. The results were consistent with the DNA sequencing results and, for some samples, revealed findings that DNA analysis could not address. Post-translational modifications have also been revealed by this method. Standard database search approaches yielded poor sequence coverage of the expressed alpha, beta, delta and gamma chains of hemoglobin. Implementation of an iterative search process, including custom sequence and PTM databases, with inclusion of error-tolerant searches in Mascot, resulted in crucial improvements in the assignments. Up to 100% sequence coverage was observed, and known sequence variants and PTMs were confirmed, from a cumulative data set consisting of 34293 total spectra (including replicates), with 1991 unique peptides and 709 unique stripped peptides representing 40 unique proteins, including co-purified plasma proteins. The extraneous plasma proteins made a substantial contribution to the data pool, and their presence had a non-trivial impact on the false discovery rate (FDR) of PTM assignments. 
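The charge-state and accurate-mass calculation mentioned above can be illustrated with a minimal Python sketch. It is not the in-house software, whose details are not given here. It assumes that the isotope peaks of one charge state have already been picked, that the ions are protonated, and that the monoisotopic m/z is known. All function names and the example values are hypothetical.

```python
# Illustrative sketch only; not the in-house software described in the text.
PROTON_MASS = 1.007276466   # Da, mass of a proton
ISOTOPE_SPACING = 1.003355  # Da, 13C minus 12C mass difference


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge state from the mean spacing of adjacent isotope peaks."""
    peaks = sorted(mz_peaks)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Isotope peaks of a z-charged ion are spaced roughly 1.003355 / z apart.
    return round(ISOTOPE_SPACING / mean_gap)


def neutral_mass(monoisotopic_mz, charge):
    """Convert the m/z of a protonated ion to its neutral monoisotopic mass."""
    return charge * (monoisotopic_mz - PROTON_MASS)


def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


if __name__ == "__main__":
    # Hypothetical isotope cluster of a doubly charged ion (made-up values).
    cluster = [501.007276, 501.508954, 502.010631]
    z = charge_from_isotope_spacing(cluster)
    mass = neutral_mass(cluster[0], z)
    print(z, mass, ppm_error(mass, 1000.0))
```

For intact globin chains the monoisotopic peak is typically weak, so a practical implementation would need a more careful way of locating it than this sketch assumes.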
Within the hemoglobin assignments we observed PTMs that were present in every chain of hemoglobin, such as multiple forms of amino acid oxidation that occurred in specific sequence regions. Less common PTMs, such as glutathionylation of cysteine and hexose modifications, were also observed. In addition to statistical approaches, manual interpretation of automated assignments that were on the border of significance helped to discriminate true from false spectral matches. Global interpretation rules were developed in order to distinguish and categorize the false matches. Based on these approaches, a data analysis scheme was developed to maximize sequence coverage, identify sequence variants and known PTMs, and discover novel PTMs. Additionally, emphasis was placed on minimizing the FDR. In-house bioinformatics tools were developed that work with the integrated peptide XML files from the Trans-Proteomic Pipeline platform and output the list of PTMs. In summary, our complementary bioinformatics and data interrogation methodologies demonstrate the feasibility of characterizing primary structural changes, variants and PTMs in proteins and peptides. This approach may be applied broadly within any proteomics analysis scheme.
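As a rough illustration of how the PTM types named above can be recognized from accurate mass alone, the following Python sketch compares the mass difference between an observed peptide and its unmodified form against standard monoisotopic mass shifts. It is a sketch under stated assumptions, not the in-house tools: the function name, the 5 ppm default tolerance and the three-entry table are hypothetical choices made for this example.

```python
# Illustrative sketch only; not the in-house bioinformatics tools.
# Standard monoisotopic mass shifts (Da) for the PTM types named in the text.
PTM_DELTAS = {
    "oxidation": 15.994915,
    "hexose": 162.052824,
    "glutathionylation": 305.068156,
}


def match_ptm(observed_mass, unmodified_mass, tol_ppm=5.0):
    """Return names of PTMs whose mass shift explains the observed difference.

    The tolerance is expressed in ppm of the observed mass (an assumption
    made for this sketch) and converted to Da before comparison.
    """
    delta = observed_mass - unmodified_mass
    tol_da = observed_mass * tol_ppm / 1e6
    return [
        name
        for name, shift in PTM_DELTAS.items()
        if abs(delta - shift) <= tol_da
    ]


if __name__ == "__main__":
    # Hypothetical peptide with an unmodified monoisotopic mass of 1000.0 Da.
    print(match_ptm(1015.9949, 1000.0))
    print(match_ptm(1305.0682, 1000.0))
```

A mass match of this kind only suggests a candidate modification. As the text notes, borderline assignments still called for manual interpretation and attention to the FDR.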