This proposal will provide information, new algorithms, and computational tools for predicting proteolytic events. The ultimate goal is to make accurate proteome-wide predictions of the substrates of any given protease. However, our current effort will focus mainly on matrix metalloproteases (MMPs), caspases, and several proprotein convertases (PCs) belonging to the serine protease family, because a vast amount of experimental information on those proteases is already available at the Sanford-Burnham Medical Research Institute. Our approach can be readily extended to any other protease once enough substrates are known to derive a statistically meaningful specificity profile. The unique feature of the proposed prediction method is that it combines sequence-based predictions with other factors. These include structural features of the substrates, cooperative interactions, and the co-localization and co-expression of substrates and proteases. We will also include information about SNPs (single nucleotide polymorphisms) and PTMs (posttranslational modifications) affecting residues in the vicinity of the cleavage sites in protein substrates. Either can alter a proteolytic event by abolishing an existing cleavage site or by creating a new one, and such changes can lead to diseases or syndromes. The proteolytic events, i.e., protease-substrate pairs, will be mapped onto the known regulatory networks. All the information that is collected and the tools that are developed will be freely available on the PMAP Web site (www.proteolysis.org) for use by the biomedical research community. Because proteases usually have more than a dozen substrates, and because the substrates often differ between normal physiology and pathology, the impact of this project could be substantial. Rather than identifying protease substrates one by one, our predictions will produce well-annotated sets of substrates that are likely to have biological significance.
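
To make the idea of a sequence-based specificity profile concrete, here is a minimal sketch of how one could be derived from aligned cleavage-site peptides and used to scan a protein sequence. It is an illustration only, not the project's actual method: the function names, the uniform background frequency, the pseudocount, and the toy peptides are all assumptions introduced for this example, and the peptides are not curated experimental data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(sites, pseudocount=1.0):
    """Build a log-odds scoring matrix from aligned, equal-length peptides.

    Assumption: a uniform background frequency (1/20) for every residue.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all cleavage-site peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        profile.append(column)
    return profile


def score_window(profile, window):
    """Sum the per-position log-odds scores for one peptide window."""
    return sum(column[aa] for column, aa in zip(profile, window))


def scan(profile, sequence):
    """Score every window of the profile's width along a protein sequence."""
    width = len(profile)
    return [
        (start, score_window(profile, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical toy input: four aligned five-residue peptides.
toy_sites = ["DEVDG", "DEVDS", "DETDA", "DQVDG"]
toy_profile = build_profile(toy_sites)

# Rank candidate sites in a made-up sequence by profile score.
hits = scan(toy_profile, "MKLADEVDGSTQ")
best_start, best_score = max(hits, key=lambda hit: hit[1])

# A single-residue substitution in the site (as an SNP might cause)
# lowers the score of the same window.
wild_type = score_window(toy_profile, "DEVDG")
variant = score_window(toy_profile, "DEVAG")
```

In this sketch, a residue substitution inside the site changes the window score, which is one simple way to picture how an SNP could weaken or create a candidate cleavage site. A sequence score of this kind would be only one input; the structural, localization, and expression factors described above are not modeled here.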

Public Health Relevance

Proteolysis is a biological process involving hydrolysis of the peptide bonds in proteins. We propose to design a computational approach for predicting the substrates of proteinases in the human proteome that takes into account amino acid sequence specificity as well as structural and biological factors. This computational approach will help detect aberrations in the processing, regulation, and degradation of proteins that lead to diseases or syndromes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM098835-02
Application #
8333323
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Preusch, Peter C
Project Start
2011-09-30
Project End
2015-08-31
Budget Start
2012-09-01
Budget End
2013-08-31
Support Year
2
Fiscal Year
2012
Total Cost
$362,900
Indirect Cost
$172,900
Name
Sanford-Burnham Medical Research Institute
DUNS #
020520466
City
La Jolla
State
CA
Country
United States
Zip Code
92037
Publications
Kumar, Sonu; van Raam, Bram J; Salvesen, Guy S et al. (2014) Caspase cleavage sites in the human proteome: CaspDB, a database of predicted substrates. PLoS One 9:e110539
Ratnikov, Boris I; Cieplak, Piotr; Gramatikoff, Kosi et al. (2014) Basis for substrate recognition and distinction by matrix metalloproteinases. Proc Natl Acad Sci U S A 111:E4148-55
Shiryaev, Sergey A; Aleshin, Alexander E; Muranaka, Norihito et al. (2014) Structural and functional diversity of metalloproteinases encoded by the Bacteroides fragilis pathogenicity island. FEBS J 281:2487-502
Belushkin, Alexander A; Vinogradov, Dmitry V; Gelfand, Mikhail S et al. (2014) Sequence-derived structural features driving proteolytic processing. Proteomics 14:42-50
Shiryaev, Sergey A; Chernov, Andrei V; Golubkov, Vladislav S et al. (2013) High-resolution analysis and functional mapping of cleavage sites and substrate proteins of furin in the human proteome. PLoS One 8:e54290