This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
Specific Aim 2 seeks to improve proteome analyses through the development of new methods, instrumentation, and techniques. The following six sections describe areas of progress under Specific Aim 2.
ANN-based Predictor for Peptide Elution Time
The artificial neural network (ANN)-based peptide elution time predictor was further improved this past year. The improvements resulted from a larger training set and a more complex ANN architecture. An architecture consisting of 1052 input nodes, 40 hidden nodes, and 1 output node was used to fully consider the amino acid residue sequence of each peptide, its length, and its hydrophobic moment. The network was trained using 526,981 non-redundant peptides identified from a total of >15,000 LC-MS/MS analyses of >25 different organisms, and the predictive capability of the model was tested using 2,752 confidently identified peptides that were not included in the training set. The model demonstrated an average elution time precision of <1.5% and a correlation between observed and predicted elution times of 0.973. The model can now accurately predict the elution times of both isomeric and isobaric peptides, which is not possible with any other model. In comparison with models published by other groups, the present model is about 100% more precise.
We believe that the present model has reached a point where only marginal improvements can be achieved in the future. As a result, we plan to shift our efforts to new predictive capabilities that will enhance our proteomic efforts, such as a predictor that assigns the likelihood that a peptide will be identified or detected by mass spectrometry. Furthermore, we plan to use fractionation techniques that separate peptides according to their isoelectric points in order to take advantage of existing predictive capabilities for calculating the theoretical peptide isoelectric point.
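As a rough illustration of the network size described above, the following sketch builds a 1052-40-1 feedforward network and counts its trainable parameters. The sigmoid activations, the bias terms, the random initialization, and the toy input vector are assumptions made for illustration; the report does not specify how peptide sequence, length, and hydrophobic moment are encoded into the 1052 inputs.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 1052, 40, 1  # layer sizes stated in the report


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a fully connected layer (random init is an assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, layers):
    """Propagate an input vector through each layer with a sigmoid activation (assumed)."""
    for weights, biases in layers:
        x = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def count_parameters(layers):
    """Count every weight and bias in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(0)
layers = [init_layer(N_INPUT, N_HIDDEN, rng), init_layer(N_HIDDEN, N_OUTPUT, rng)]

# Hypothetical encoded peptide: the real encoding scheme is not given in the report.
x = [0.0] * N_INPUT
for i in (3, 57, 410, 1051):
    x[i] = 1.0

prediction = forward(x, layers)      # a single value in (0, 1), read as a normalized elution time
n_params = count_parameters(layers)  # 1052*40 + 40 + 40*1 + 1
```

Under these assumptions the network has 42,161 weights and biases, so the 526,981 training peptides correspond to roughly 12.5 training examples per parameter.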
We intend to use these future capabilities (peptide detectability and isoelectric point prediction) in conjunction with elution time prediction and mass accuracy to increase the confidence of our peptide identifications.
Developments in High Efficiency Reversed Phase LC Separations
New developments in reversed phase (RP) LC separations have been achieved using 50 μm i.d. fused silica capillaries packed with micron-sized C18-bonded porous silica particles, attaining peak capacities of 130-420. When these RPLC separations were combined with a linear ion trap mass spectrometer, analysis of a S. oneidensis tryptic digest identified ~1,000 proteins in 50 minutes from ~4,000 tryptic peptides, ~550 proteins in 20 minutes from ~1,800 peptides, and ~250 proteins in 8 minutes from ~700 peptides. We found that 55% of the MS/MS spectra acquired during the entire analysis (and up to 100% of the MS/MS spectra acquired from the most data-rich zone) were of sufficient quality for identifying peptides. The results indicate that such analyses using very fast (minutes) RPLC separations based on columns packed with micron-sized porous particles are primarily limited by the MS/MS analysis speed.
We have explored the basis for ultrahigh-throughput proteomics measurements using high-speed RPLC combined with high accuracy mass spectrometric measurements. TOF and FTICR mass spectrometers were evaluated in conjunction with 0.8-μm porous C18 particle-packed RPLC capillary columns (50 μm i.d.) for identifying peptides using the Accurate Mass and Time (AMT) tag approach. Peptide RPLC relative retention (elution) times were correlated to within 5% between separations that differed by at least 25-fold in speed, which allowed peptides to be identified using AMT tags established from much slower RPLC-MS/MS analyses.
When coupled with RPLC, the mass spectrometers operated at fast spectrum acquisition speeds (e.g., 0.2 sec for TOF and either 0.3 or 0.6 sec for FTICR), and peptide mass measurement accuracies of better than 15 ppm were obtained. Ion population control during fast separations improved the mass accuracies obtained with FTICR, but the detection of low abundance species was somewhat suppressed in the fast analyses. The proteome coverage obtained using the AMT tag approach was limited by the separation peak capacity, the sensitivity of the MS, and the accuracy of both the mass measurements and the relative RPLC peptide elution times. Experimental results demonstrated that accuracies of 5% for the RPLC relative retention times and 15 ppm for mass measurements were sufficient for confident identification of >2800 peptides and >760 proteins from >13,000 different detected species. The approach allowed ~600 proteins from a S. oneidensis sample to be identified from assignment of ~2000 peptides in 150 sec. The TOF instrumentation was found to be advantageous for faster separations (<120 sec), while FTICR MS was more effective for analysis times of >150 sec due to the improved mass accuracies achievable with longer spectrum acquisition times and better ion population control. The present work demonstrates the feasibility of very high throughput proteomics measurements and indicates that significant additional improvements in throughput are achievable by further increasing the speed of high peak capacity separations, as well as by increasing the measurement sensitivity and the accuracy of mass measurements.
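The matching step in the AMT tag approach can be sketched as follows: an observed LC-MS feature is assigned to a database tag when its mass agrees within a ppm tolerance and its normalized elution time (NET) agrees within a fractional tolerance. This is a minimal sketch, not the software used in this work. The `AmtTag` and `match_features` names, the example masses and NET values, and the reading of the 5% tolerance as 0.05 on a 0-1 NET scale are assumptions for illustration, and the peptide labels are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # placeholder label, not a real identification
    mass: float    # monoisotopic mass in Da
    net: float     # normalized elution time on a 0-1 scale


def ppm_error(observed, reference):
    """Signed mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def match_features(features, tags, ppm_tol=15.0, net_tol=0.05):
    """Return (feature, tag) pairs that agree in both mass and NET.

    features: iterable of (mass, net) tuples from a fast LC-MS run.
    A feature matching several tags is reported once per tag, so
    ambiguous matches stay visible.
    """
    matches = []
    for mass, net in features:
        for tag in tags:
            mass_ok = abs(ppm_error(mass, tag.mass)) <= ppm_tol
            net_ok = abs(net - tag.net) <= net_tol
            if mass_ok and net_ok:
                matches.append(((mass, net), tag))
    return matches


tags = [
    AmtTag("PEPTIDE_A", 1000.0000, 0.30),
    AmtTag("PEPTIDE_B", 1000.0000, 0.60),  # same mass as A, separated only by NET
    AmtTag("PEPTIDE_C", 1500.7500, 0.45),
]
features = [
    (1000.0100, 0.32),  # 10 ppm from A and B; NET agrees only with A
    (1500.8000, 0.45),  # about 33 ppm from C: rejected on mass
    (1500.7600, 0.52),  # about 6.7 ppm from C, NET off by 0.07: rejected on NET
]
matched = match_features(features, tags)
matched_peptides = [tag.peptide for _, tag in matched]
```

The two tags with identical mass show why the elution time dimension matters: mass alone cannot distinguish them, while the NET tolerance assigns the first feature to only one of them.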
High-Throughput Proteome-Wide Analysis of Intact Proteins
During the past year, we focused on the development of a novel capability for high-throughput proteome-wide analysis of intact proteins by combining bottom-up proteomics with high-accuracy FTICR intact protein mass measurements and targeted tandem mass spectrometry (MS/MS) employing a variety of dissociation schemes (e.g., collisionally induced and electron capture dissociation; CID and ECD, respectively). The ultimate goal is to develop a tool set for comparative proteomics at the intact protein level that includes methods to accurately quantify changes in the levels of proteins and protein post-translational modifications (PTMs). Since protein MS/MS (particularly with ECD) is currently too slow to be effective for on-line separations, we opted to develop a profiling technique based on 2D separations coupled with FTICR MS (e.g., mass vs. retention time maps) and to target exclusively discriminatory proteins for MS/MS. This methodology is novel in that it targets intact proteins (and not tryptic peptides), hence offering a significant advantage for comparative proteomic measurements.
We have an NCRR collaboration with Dr. Thomas Squier's group at PNNL to characterize the dynamics of protein modifications induced by reactive oxygen and nitrogen species in macrophage cells. Tyrosine phosphorylation is an essential part of cellular signaling, and nitrotyrosine formation could block or mimic phosphorylation. It has recently been argued that tyrosine nitration is also a dynamic process that leads to cellular signaling in radical-rich environments, such as the mitochondria. The dynamic process of nitration and denitration, dependent on specific cellular conditions, has led some to speculate upon the existence of a denitrase enzyme.
We have shown that the induction of radical generation in macrophages stimulates increased clearance of nitrated forms of the ubiquitous calcium sensor protein calmodulin, and we have tentatively identified calmodulin as a substrate for a putative denitrase in macrophages. We plan to follow up with an in-depth characterization of the changes in key complexes associated with oxidative stress; this research will shed light not only on the mechanisms of macrophage activation, but also on the dynamics of protein nitration within the cell.
We have an NCRR collaboration with Dr. Brian Thrall's group at PNNL to characterize the secreted proteome from a human mammary epithelial cell (HMEC) line and to quantify changes in secreted proteins observed during activation by phorbol 12-myristate 13-acetate (PMA), a known tumor promoter and potent activator of protein secretion/shedding. Considering the significant role that secreted proteins play in the survival, proliferation, and differentiation of cells, it is important to develop methods that specifically detect, identify, and quantify these proteins. Intact protein studies will complement the bottom-up proteomics data already gathered.
We have also initiated a new NCRR collaboration with Dr. Nahum Sonenberg at McGill University to study translational control mechanisms. A top-down mass spectrometric approach using a 12T FTICR mass spectrometer equipped with ECD capability will be employed to characterize mouse brain-derived translational repressors that are subjected to developmentally regulated (e.g., adult vs. early postnatal brain) PTMs. These currently unknown PTMs may reveal a novel control mechanism for protein synthesis in the brain.
Developments with Monolithic LC Columns
In the past year, high-efficiency 70 cm x 20 μm i.d. silica-based monolithic capillary LC columns have been prepared. The monolith appears as a porous network with ~3 μm pores.
This denser sol-gel skeleton not only decreases the mass transfer resistance from the mobile phase to the stationary phase, but also increases the surface area of the silica skeleton, which determines the sample loading capacity of the column, as well as analyte retention. The columns at a mobile phase pressure of 5000 psi provide flow rates of ~40 nL/min at an optimum linear velocity of ~0.24 cm/s. The columns provide a separation peak capacity of ~420 in conjunction with both on-line coupling with micro solid phase extraction (SPE) and ESI-MS.
The sensitivity of the monolithic columns for protein identification was evaluated using a BSA tryptic digest. A sample containing 15 attomoles of BSA tryptic digest in 10 μL of solution was loaded onto the on-line SPE 50 μm i.d. monolithic column, separated by the 20 μm i.d. monolithic column, and analyzed by ESI ion trap-MS/MS with peptide identification using the SEQUEST algorithm. As an example of the sensitivity achieved, three doubly charged tryptic peptides were confidently identified with high SEQUEST scores from 15 attomoles of tryptically digested BSA. Application of the high sensitivity on-line microSPE-nanoLC-MS to the more complex S. oneidensis tryptically digested proteomic sample enabled identification of 855 proteins from a single 10 h nanoLC-MS/MS analysis.
To further improve the sensitivity and reduce the ion suppression effect, 10-μm-i.d. silica-based monolithic LC columns were fabricated with integral nanoESI emitters (i.e., from a single fused-silica capillary) and combined with a 50-μm-i.d. SPE pre-column. The 25 cm long x 10-μm-i.d. monolithic LC columns provide optimum flow rates of ~10 nL/min at pressures of ~1000 psi. A more hydrophilic SPE column (packed with more hydrophilic YMC ODS-AQ packing material) was used to reduce the effect of the dead volume between the SPE column and the analytical column on the separation.
A 5-attomole BSA tryptic digest in 1 μL of solution was analyzed using the integrated monolithic column interfaced to a conventional ion trap (LCQ) for MS/MS. Multiple peptides were identified using the SEQUEST program. As an example of the high sensitivity achieved with a much more complex mixture of peptides, analysis of 100 ng of a tryptic digest of soluble S. oneidensis proteins allowed 1332 proteins to be confidently identified (from 5164 different peptides) in a 3 h analysis using a different linear ion trap (LTQ) for MS/MS. In addition, at flow rates in the 10 nL/min regime, the compound-to-compound variations in MS response are minimized and MS response is expected to vary more linearly with concentration. The good linear relationships between peptide MS responses and sample amount for BSA tryptic digests are consistent with this expectation. The use of an integrated 10-μm-i.d. silica-based monolithic column has demonstrated good separation efficiency, as well as greatly improved ESI-MS sensitivity compared to columns of conventional dimensions. This combination of advances is anticipated to provide greater sensitivity for a broad range of proteomics applications and to facilitate the use of label-free quantitation methods by avoiding most contributions due to ion suppression effects.
Developments on Multi-Nano-ESI Emitters
The three main advantages of performing ESI at nL/min flow rates are that (1) the reduced flow rate decreases the initial droplet size, improving desolvation, (2) there is more excess charge available per analyte, and (3) there is less charge competition, improving quantitation. However, HPLC separations routinely use μL/min flow rates. Over the past year our aim has been to split the higher flow into several smaller flows and create a nano-ESI from each (creating nano-ESI from μL/min flow rates). We are exploring two different approaches to accomplish this goal: (1) ESI emitters made from monolithic columns and (2) microfabricated, multi-emitter chips.
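The flow-splitting idea can be made concrete with a little arithmetic. The sketch below divides a μL/min flow evenly among n emitters and estimates the relative total spray current, assuming each emitter operates in a cone-jet regime where the spray current scales roughly with the square root of the flow rate. That scaling law, the even split, and the example values of 1 μL/min and 20 emitters are assumptions for illustration and are not taken from this report.

```python
import math


def per_emitter_flow_nl_min(total_flow_ul_min, n_emitters):
    """Flow through each emitter in nL/min, assuming an even split."""
    return total_flow_ul_min * 1000.0 / n_emitters


def relative_total_current(n_emitters):
    """Total current of n emitters relative to one emitter at the same total flow.

    Assumes the current of each emitter is proportional to sqrt(Q), so
    n * sqrt(Q / n) / sqrt(Q) = sqrt(n).
    """
    return math.sqrt(n_emitters)


flow_each = per_emitter_flow_nl_min(1.0, 20)  # nL/min per emitter
gain = relative_total_current(20)             # dimensionless current ratio
```

Under these assumptions, 1 μL/min split over 20 emitters gives 50 nL/min per emitter and roughly a 4.5-fold higher total current than a single emitter carrying the whole flow.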
The monolithic-based ESI emitters are made from a short section of an HPLC monolithic column. The porous monolith creates several flow paths (splitting the high flow into several smaller flows), and the rough surface of the monolith at the emitter tip promotes the formation of multiple electrosprays (creating several nano-electrosprays from a μL/min flow rate). Initial results show an ~10-fold increase in ESI current compared to a standard ESI emitter using the same solution and flow rate. Additionally, mapping of the current density of the ESI plume(s) shows increased ion production, and the acquired mass spectra show increased peak intensities, indicating the formation of multiple electrosprays from the emitters. This work has also led to an application of the technology with low-flow HPLC-MS/MS using a 10 μm i.d. monolithic column with an incorporated monolithic ESI emitter for proteomic analyses.
We are currently fabricating the multi-emitter chips. In contrast to the monolithic emitters, the chips have a discrete number of emitters and microfabricated channels to split the solution flow. Once the chips are completed, we will test, characterize, and compare them to the standard and monolithic emitters. In addition, this work is leading us to research better ways to transmit the resulting increased ion population into the first vacuum stage of the mass spectrometer, which should yield a significant improvement in instrument sensitivity.
Progress on the Characterization of the Human Blood Plasma Proteome
The blood plasma proteome has been widely recognized for its significant potential in providing diagnostic or therapeutic biomarkers for various diseases, as well as its potential contribution to personalized medicine; however, it is also the most challenging mammalian proteome to characterize due to its tremendous complexity and the extraordinary dynamic range of protein concentrations.
To address the challenge of plasma proteome characterization, we have developed a divide-and-conquer strategy that combines immunoaffinity depletion of the top 12 highly abundant plasma proteins, high efficiency enrichment of cysteinyl-peptides and N-linked glycopeptides, and two-dimensional LC-MS/MS analyses. In addition, a set of criteria was established to ensure the high confidence of plasma protein identifications based on a probability-based evaluation model. By applying this strategy to a pooled trauma patient plasma sample, we have obtained the highest-confidence dataset to date: 3654 plasma proteins based upon 22,300 different peptide identifications, with an overall dynamic range of detection of ~10^8. Among the 3654 proteins, 1494 proteins were identified by at least two peptides per protein (>99% confidence) and the other ~2100 proteins were identified with >90% confidence. The tremendous depth of plasma proteome coverage achieved by applying this approach demonstrates its potential for discovering candidate disease biomarkers for subsequent quantitative clinical applications.
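The two confidence tiers described above can be expressed as a simple grouping rule: proteins supported by at least two distinct peptides go into the higher-confidence tier, and single-peptide proteins into the lower one. The sketch below assumes that peptide-to-protein assignments are already available and have passed the probability-based filter, which is not reproduced here; the `tier_proteins` name and the toy identifiers are hypothetical.

```python
from collections import defaultdict


def tier_proteins(peptide_hits, min_peptides=2):
    """Split proteins into (multi_peptide, single_peptide) sets.

    peptide_hits: iterable of (protein_id, peptide_sequence) pairs.
    Repeated observations of the same peptide count once.
    """
    peptides_by_protein = defaultdict(set)
    for protein_id, peptide in peptide_hits:
        peptides_by_protein[protein_id].add(peptide)
    multi = {p for p, peps in peptides_by_protein.items() if len(peps) >= min_peptides}
    single = set(peptides_by_protein) - multi
    return multi, single


# Toy data: PROT_2 is seen twice, but always through the same peptide.
hits = [
    ("PROT_1", "AAAK"), ("PROT_1", "CCCR"), ("PROT_1", "AAAK"),
    ("PROT_2", "DDDK"), ("PROT_2", "DDDK"),
    ("PROT_3", "EEEK"), ("PROT_3", "FFFR"),
]
multi, single = tier_proteins(hits)
```

Applied to the counts in this report, 3654 proteins in total with 1494 in the multi-peptide tier leaves 2160 single-peptide proteins, consistent with the ~2100 quoted.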