Mass spectrometry (MS)-based proteomics has made remarkable advances over the past several years. It now enables the detection of 8,000 proteins in a single analysis, and with extensive fractionation, protein detection levels approaching those of transcriptomics can be achieved. Despite such advances, a significant bias exists in the proteome routinely detected by MS due to the near-exclusive use of trypsin for proteolytic cleavage during sample preparation. Trypsin is a well-suited protease for proteomics, as it produces peptides with favorable chemical composition for MS analysis. However, it also locks a considerable fraction of the proteome in peptides either too small or too large for MS detection, thus rendering these segments of the proteome effectively invisible in >99% of the proteomics experiments performed to date. Only a handful of global proteomics experiments have reported the use of alternative proteases, primarily due to the generally superior performance of trypsin and the increased instrument time required to analyze additional samples, a limiting factor for most labs. Recently developed MS approaches, specifically data-independent acquisition (DIA), operate under new experimental and computational paradigms that rely on deconvolution of highly complex MS spectra and matching to peptide or spectral databases for detection. This new paradigm presents the opportunity to multiplex proteomic samples generated from a variety of different proteases in a single MS analysis. However, to date DIA analysis has been developed exclusively for tryptic peptides. Here we propose an innovative DIA acquisition and computational analysis approach to multiplex multiple proteases and unlock the hidden proteome. To achieve the goals of this proposal, we will first optimize DIA for non-trypsin proteases and then apply these optimized conditions in a multiplexed DIA setting with a mixture of different proteases. Lastly, we will further develop and apply this framework to the analysis of post-translational modifications (e.g., phosphorylation), where increased proteome coverage is essential for modification detection and localization. Successful completion of this work will provide a robust framework to dramatically increase the proteome routinely detected and quantified in MS analysis. Importantly, all details of this workflow, from sample handling to software for data analysis, will be well documented in step-by-step online protocols and freely distributed to the community to enable rapid integration of this approach into modern proteomics workflows.
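
The coverage argument above can be illustrated with a small in silico digestion. The following is a minimal sketch and not the project's software: the function names `digest` and `covered`, the simplified cleavage rules (trypsin cutting after K or R unless followed by P, and a GluC-like rule cutting after E), the absence of missed cleavages, and the 7 to 35 residue detectability window are all assumptions chosen for illustration.

```python
def digest(sequence, cleave_after, blocked_by=""):
    """Cut a protein sequence after any residue in `cleave_after`,
    unless the next residue is in `blocked_by` (no missed cleavages)."""
    peptides, start = [], 0
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return [p for p in peptides if p]


def covered(sequence, cleave_after, blocked_by="", min_len=7, max_len=35):
    """Return the set of residue indices that fall in peptides whose length
    lies inside the assumed detectable window [min_len, max_len]."""
    covered_idx, pos = set(), 0
    for peptide in digest(sequence, cleave_after, blocked_by):
        if min_len <= len(peptide) <= max_len:
            covered_idx.update(range(pos, pos + len(peptide)))
        pos += len(peptide)
    return covered_idx


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    protein = "MKRKAEDLLKKRSSPEGTWLNVAEIKR"
    tryptic = covered(protein, "KR", "P")   # simplified trypsin rule
    gluc_like = covered(protein, "E")       # simplified GluC-like rule
    combined = tryptic | gluc_like
    for name, idx in [("trypsin", tryptic), ("GluC-like", gluc_like),
                      ("combined", combined)]:
        print(f"{name}: {len(idx)}/{len(protein)} residues covered")
```

Taking the union of covered residues across proteases is the simplest way to express why a multi-protease experiment can reach sequence regions that a trypsin-only digest leaves in peptides outside the detectable range.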

Public Health Relevance

Comprehensive analysis of the proteome is essential to enhancing our understanding of human biology and disease. However, such analysis is hindered by fundamental limitations in traditional proteomic approaches, which render significant portions of the proteome inaccessible. This project aims to develop novel proteomic multiplexing and deconvolution approaches to unlock the hidden proteome and enable biological discoveries that require amino acid-level resolution.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
1R01GM133981-01
Application #
9799746
Study Section
Enabling Bioanalytical and Imaging Technologies Study Section (EBIT)
Program Officer
Smith, Ward
Project Start
2019-09-15
Project End
2023-08-31
Budget Start
2019-09-15
Budget End
2020-08-31
Support Year
1
Fiscal Year
2019
Total Cost
Indirect Cost
Name
University of California San Francisco
Department
Pharmacology
Type
Schools of Medicine
DUNS #
094878337
City
San Francisco
State
CA
Country
United States
Zip Code
94118