Computational Tools for Proteoform Identification by Top-Down Data Independent Acquisition Mass Spectrometry

Liu, Xiaowen; Ning, Xia; Sun, Liangliang

Abstract

Mass spectrometry-based top-down proteomics has become one of the most informative approaches in protein analysis because it provides a bird's-eye view of intact proteoforms (protein forms) generated from post-translational modifications and sequence variations. Data dependent acquisition and data independent acquisition are the two main methods in top-down mass spectrometry. The former has been the dominant one, but it faces two main challenges in proteome-wide studies: low protein coverage (a regular experiment on human cells identifies only 200 to 400 proteins) and low reproducibility (technical triplicates share only about one third of identified proteoforms). Top-down data independent acquisition mass spectrometry (TD-DIA-MS) has the potential to significantly increase protein coverage and improve reproducibility in proteome-wide studies. However, its application has been hampered by the complexity of the data and the lack of efficient software tools. To address this problem, we will propose new algorithms and machine learning models and develop the first software package for proteoform identification by TD-DIA-MS. The proposed research will be conducted by a group of researchers with complementary expertise. All the proposed algorithms will be implemented as user-friendly open source software tools.

Public Health Relevance

This project addresses the proteoform identification problem by top-down data independent acquisition mass spectrometry. We will propose new machine learning models and new algorithms for high-throughput proteome-wide identification of complex proteoforms with post-translational modifications and sequence variations by using top-down data independent acquisition mass spectrometry. The proposed methods will facilitate the study of the function of complex proteoforms and the discovery of proteome biomarkers.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 2R01GM118470-05
Application #: 10049810
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul

Project Start: 2016-06-01
Project End: 2024-08-31
Budget Start: 2020-09-01
Budget End: 2021-08-31
Support Year: 5
Fiscal Year: 2020
Total Cost
Indirect Cost

Institution

Name: Indiana University-Purdue University at Indianapolis
Department: Miscellaneous
Type: Schools of Arts and Sciences
DUNS #: 603007902

City: Indianapolis
State: IN
Country: United States
Zip Code: 46202

Related projects


NIH 2020 R01 GM	Computational Tools for Proteoform Identification by Top-Down Data Independent Acquisition Mass Spectrometry Liu, Xiaowen; Ning, Xia; Sun, Liangliang / Indiana University-Purdue University at Indianapolis
NIH 2019 R01 GM	Computational tools for top down mass spectrometry based proteoform identification and proteogenomics Liu, Xiaowen; Liu, Yunlong / Indiana University-Purdue University at Indianapolis
NIH 2018 R01 GM	Computational tools for top down mass spectrometry based proteoform identification and proteogenomics Liu, Xiaowen; Liu, Yunlong / Indiana University-Purdue University at Indianapolis
NIH 2017 R01 GM	Computational tools for top down mass spectrometry based proteoform identification and proteogenomics Liu, Xiaowen; Liu, Yunlong / Indiana University-Purdue University at Indianapolis	$262,866
NIH 2016 R01 GM	Computational tools for top down mass spectrometry based proteoform identification and proteogenomics Liu, Xiaowen; Liu, Yunlong / Indiana University-Purdue University at Indianapolis	$299,451

Publications

Li, Ziwei; He, Bo; Kou, Qiang et al. (2018) Evaluation of top-down mass spectral identification with homologous protein sequences. BMC Bioinformatics 19:494

McCool, Elijah N; Lubeckyj, Rachele A; Shen, Xiaojing et al. (2018) Deep Top-Down Proteomics Using Capillary Zone Electrophoresis-Tandem Mass Spectrometry: Identification of 5700 Proteoforms from the Escherichia coli Proteome. Anal Chem 90:5529-5533

McCool, Elijah N; Lubeckyj, Rachele; Shen, Xiaojing et al. (2018) Large-scale Top-down Proteomics Using Capillary Zone Electrophoresis Tandem Mass Spectrometry. J Vis Exp :

Shen, Xiaojing; Kou, Qiang; Guo, Ruiqiong et al. (2018) Native Proteomics in Discovery Mode Using Size-Exclusion Chromatography-Capillary Zone Electrophoresis-Tandem Mass Spectrometry. Anal Chem 90:10095-10099

Kou, Qiang; Wu, Si; Liu, Xiaowen (2018) Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry. Proteomics 18:

Fornelli, Luca; Ayoub, Daniel; Aizikov, Konstantin et al. (2017) Top-down analysis of immunoglobulin G isotypes 1 and 2 with electron transfer dissociation on a high-field Orbitrap mass spectrometer. J Proteomics 159:67-76

Ma, Hongyan; Delafield, Daniel G; Wang, Zhe et al. (2017) Finding Biomass Degrading Enzymes Through an Activity-Correlated Quantitative Proteomics Platform (ACPP). J Am Soc Mass Spectrom 28:655-663

Kou, Qiang; Wu, Si; Tolic, Nikola et al. (2017) A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra. Bioinformatics 33:1309-1316

Zhang, Xinjun; Li, Meng; Lin, Hai et al. (2017) regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution. Hum Genet 136:1279-1289

Yang, Runmin; Zhu, Daming; Kou, Qiang et al. (2017) A Spectrum Graph-Based Protein Sequence Filtering Algorithm for Proteoform Identification by Top-Down Mass Spectrometry. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2017:222-229

Showing the most recent 10 out of 15 publications
