Mass spectrometry (MS)-based proteomics has made remarkable advances over the past several years, now enabling the detection of 8,000 proteins in a single analysis; with extensive fractionation, protein detection levels approaching those of transcriptomics can be achieved. Despite such advances, a significant bias exists in the proteome routinely detected by MS due to the near-exclusive use of trypsin for proteolytic cleavage during sample preparation. Trypsin is a well-suited protease for proteomics, as it produces peptides with favorable chemical composition for MS analysis. However, it also locks a considerable fraction of the proteome in peptides either too small or too large for MS detection, rendering these segments of the proteome effectively invisible in >99% of the proteomics experiments performed to date. Only a handful of global proteomics experiments have reported the use of alternative proteases, primarily because of the generally superior performance of trypsin and the increased instrument time required to analyze additional samples, a limiting factor for most labs. Recently developed MS approaches, specifically data-independent acquisition (DIA), operate under new experimental and computational paradigms that rely on deconvolution of highly complex MS spectra and matching to peptide or spectral databases for detection. This new paradigm presents the opportunity to multiplex proteomic samples generated from a variety of different proteases in a single MS analysis. However, to date DIA analysis has been developed exclusively for tryptic peptides. Here we propose an innovative DIA acquisition and computational analysis approach to multiplex multiple proteases and unlock the hidden proteome. To achieve the goals of this proposal we will first optimize DIA for non-trypsin proteases, and then apply these optimized conditions in a DIA multiplexed setting with a mixture of different proteases.
Lastly, we will further develop and apply this framework to the analysis of post-translational modifications (e.g., phosphorylation), where increased proteome coverage is essential for modification detection and localization. Successful completion of this work will provide a robust framework to dramatically increase the proteome routinely detected and quantified in MS analysis. Importantly, all details of this workflow, from sample handling to software for data analysis, will be well documented in step-by-step online protocols and freely distributed to the community to enable rapid integration of this approach into modern proteomics workflows.
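The coverage bias described in the abstract can be illustrated with a minimal in-silico digestion sketch. It is not the proposal's software: the function names, the toy sequence, and the 7 to 35 residue "detectable" length window are assumptions chosen for illustration, and the cleavage rules are simplified (trypsin cuts after K or R unless followed by P; the Glu-C-like rule cuts after E only).

```python
def digest(sequence, cleave_after="KR", no_cleave_before="P"):
    """Return (start_index, peptide) pairs from a simplified in-silico digest.

    Cleaves C-terminal to any residue in `cleave_after`, unless the next
    residue is in `no_cleave_before`. Missed cleavages are not modeled.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in cleave_after and (nxt == "" or nxt not in no_cleave_before):
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        # Trailing peptide with no cleavage site at its C-terminus.
        peptides.append((start, sequence[start:]))
    return peptides


def detectable_coverage(sequence, min_len=7, max_len=35, **rules):
    """Fraction of residues that fall in peptides within the length window.

    The 7-35 window is an illustrative assumption, not a measured limit.
    """
    covered = [False] * len(sequence)
    for start, peptide in digest(sequence, **rules):
        if min_len <= len(peptide) <= max_len:
            for j in range(start, start + len(peptide)):
                covered[j] = True
    return sum(covered) / len(sequence)


if __name__ == "__main__":
    toy = "MKAAAAAAAKR"  # hypothetical sequence, not a real protein
    print("tryptic peptides:", digest(toy))
    print("tryptic coverage:", detectable_coverage(toy))
    print("Glu-C-like coverage:",
          detectable_coverage(toy, cleave_after="E", no_cleave_before=""))
```

In this toy case the tryptic digest leaves the short terminal peptides outside the length window, while a protease with different specificity covers the same residues, which is the complementarity that multi-protease multiplexing aims to exploit.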

Public Health Relevance

Comprehensive analysis of the proteome is essential to enhancing our understanding of human biology and disease. However, such analysis is hindered by fundamental limitations in traditional proteomic approaches, which render significant portions of the proteome inaccessible. This project aims to develop novel proteomic multiplexing and deconvolution approaches to unlock the hidden proteome and enable biological discoveries that require amino acid-level resolution.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Enabling Bioanalytical and Imaging Technologies Study Section (EBIT)
Program Officer
Gindhart, Joseph G
University of California San Francisco
Schools of Medicine
San Francisco
United States