Recovering Proteoforms from Cardiovascular Omics Datasets: A Multi-omics Secondary Analysis

Lam, Maggie

Abstract

Large-scale omics techniques, including proteomics and RNA-seq, have become important tools for identifying disease mechanisms and therapeutic targets. However, these experiments have largely not considered "proteoforms": protein variants encoded by the same gene, arising for example through alternative splicing (AS) and post-translational modifications (PTMs), that can serve different cellular functions and whose distributions are often perturbed in disease. In the heart in particular, alternative splicing is implicated in broad pathological processes in heart failure and cardiomyopathy, but at present we have a poor understanding of the expression status and molecular functions of many alternative splice isoform products at the protein level. We recently developed and optimized a computational pipeline that integrates RNA-seq and proteomics data to recover protein isoform information that is otherwise lost in proteomics analyses. Our goal now is to perform a targeted secondary analysis of publicly available quantitative proteomics data on heart diseases that are housed in persistent data repositories. Specifically, Aim 1 will (i) identify and quantify alternative splice isoforms in heart failure and atrial fibrillation proteomics data, using custom sequence databases constructed from RNA-seq data; and (ii) determine the intersections between AS isoforms and PTM sites at regulatory hotspots, with the aid of mass-tolerant open-search algorithms that can recover unexpected PTMs in proteomics data. By reanalyzing existing datasets with our pipeline, we aim to extract isoform-level knowledge from data that are already available, which we expect is likely to open unforeseen avenues in heart disease research and to add value to the rich data resources of our research community.

Public Health Relevance

Protein variants from the same gene often have different functions due to biochemical modifications of their amino acid sequences, but these differences are often not resolved in large-scale studies of cardiac diseases because of technical limitations. Here we propose to perform a secondary analysis of protein expression datasets in the public domain to extract hidden information on proteoforms using a computational approach we recently developed. If successful, the results of the study may improve researchers' ability to discern a new class of disease biomarkers (changes in variant proteins), which can in turn help diagnose heart diseases and predict their progression.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21HL150456-02
Application #: 10084750
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Papanicolaou, George

Project Start: 2020-01-15
Project End: 2021-12-30
Budget Start: 2020-12-31
Budget End: 2021-12-30
Support Year: 2
Fiscal Year: 2021

Institution

Name: University of Colorado Denver
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 041096314

City: Aurora
State: CO
Country: United States
Zip Code: 80045

Related projects


NIH 2021 R21 HL	Recovering Proteoforms from Cardiovascular Omics Datasets: A Multi-omics Secondary Analysis Lam, Maggie / University of Colorado Denver
NIH 2020 R21 HL	Recovering Proteoforms from Cardiovascular Omics Datasets: A Multi-omics Secondary Analysis Lam, Maggie / University of Colorado Denver
