Mass spectrometry-based top-down proteomics has become one of the most informative approaches in protein analysis because it provides a bird's-eye view of intact proteoforms (protein forms) generated from post-translational modifications and sequence variations. Data-dependent acquisition and data-independent acquisition are the two main methods in top-down mass spectrometry. The former has been the dominant one, but it faces two main challenges in proteome-wide studies: low protein coverage (a regular experiment on human cells identifies only 200 to 400 proteins) and low reproducibility (technical triplicates share only about one third of identified proteoforms). Top-down data-independent acquisition mass spectrometry (TD-DIA-MS) has the potential to significantly increase protein coverage and improve reproducibility in proteome-wide studies. However, its application has been hampered by the complexity of the data and the lack of efficient software tools. To address this problem, we will propose new algorithms and machine learning models and develop the first software package for proteoform identification by TD-DIA-MS. The proposed research will be conducted by a group of researchers with complementary expertise. All the proposed algorithms will be implemented as user-friendly open source software tools.

Public Health Relevance

This project addresses the problem of proteoform identification by top-down data-independent acquisition mass spectrometry. We will propose new machine learning models and algorithms for high-throughput, proteome-wide identification of complex proteoforms with post-translational modifications and sequence variations using top-down data-independent acquisition mass spectrometry. The proposed methods will facilitate the study of the function of complex proteoforms and the discovery of proteome biomarkers.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
2R01GM118470-05
Application #
10049810
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Project Start
2016-06-01
Project End
2024-08-31
Budget Start
2020-09-01
Budget End
2021-08-31
Support Year
5
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Indiana University-Purdue University at Indianapolis
Department
Miscellaneous
Type
Schools of Arts and Sciences
DUNS #
603007902
City
Indianapolis
State
IN
Country
United States
Zip Code
46202
Li, Ziwei; He, Bo; Kou, Qiang et al. (2018) Evaluation of top-down mass spectral identification with homologous protein sequences. BMC Bioinformatics 19:494
McCool, Elijah N; Lubeckyj, Rachele A; Shen, Xiaojing et al. (2018) Deep Top-Down Proteomics Using Capillary Zone Electrophoresis-Tandem Mass Spectrometry: Identification of 5700 Proteoforms from the Escherichia coli Proteome. Anal Chem 90:5529-5533
McCool, Elijah N; Lubeckyj, Rachele; Shen, Xiaojing et al. (2018) Large-scale Top-down Proteomics Using Capillary Zone Electrophoresis Tandem Mass Spectrometry. J Vis Exp :
Shen, Xiaojing; Kou, Qiang; Guo, Ruiqiong et al. (2018) Native Proteomics in Discovery Mode Using Size-Exclusion Chromatography-Capillary Zone Electrophoresis-Tandem Mass Spectrometry. Anal Chem 90:10095-10099
Kou, Qiang; Wu, Si; Liu, Xiaowen (2018) Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry. Proteomics 18:
Fornelli, Luca; Ayoub, Daniel; Aizikov, Konstantin et al. (2017) Top-down analysis of immunoglobulin G isotypes 1 and 2 with electron transfer dissociation on a high-field Orbitrap mass spectrometer. J Proteomics 159:67-76
Ma, Hongyan; Delafield, Daniel G; Wang, Zhe et al. (2017) Finding Biomass Degrading Enzymes Through an Activity-Correlated Quantitative Proteomics Platform (ACPP). J Am Soc Mass Spectrom 28:655-663
Kou, Qiang; Wu, Si; Tolic, Nikola et al. (2017) A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra. Bioinformatics 33:1309-1316
Zhang, Xinjun; Li, Meng; Lin, Hai et al. (2017) regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution. Hum Genet 136:1279-1289
Yang, Runmin; Zhu, Daming; Kou, Qiang et al. (2017) A Spectrum Graph-Based Protein Sequence Filtering Algorithm for Proteoform Identification by Top-Down Mass Spectrometry. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2017:222-229

Showing the most recent 10 out of 15 publications