Most proteins in an organism are reversibly modified by the covalent attachment of chemical groups to specific amino-acid residues. Such post-translational modification (PTM) allows protein function to be modulated on a physiological time-scale. PTMs are involved in regulating most cellular processes and they are frequently disrupted in major diseases such as cancer and Alzheimer's. Our understanding of their role has been hampered by their extraordinary complexity. It is not uncommon for a protein to have many types of modification (phosphorylation, methylation, ADP-ribosylation, ubiquitination, etc.) on multiple sites and for modifications on different sites to interact combinatorially to influence protein function. This can create an explosion of combinatorial modification states. For instance, the tumor suppressor and guardian of the genome p53, on which this proposal will focus, has more than 100 sites of modification, creating the potential for more than 10^30 combinatorial modification states on this single protein. Of course, very few of these states will be present in any cellular condition but different conditions may elicit different patterns of modification. This has led many researchers to suggest that combinatorial PTMs provide some form of code for cellular information processing, such as the p53 code, the tubulin code and the histone code. However, while correlations between patterns of modification and downstream responses have been shown, causality has not been demonstrated and the concept of code remains, at best, a metaphor. In previous work, we have laid a foundation for addressing this problem. We have introduced the concepts of mod-form, for a pattern of modification across the whole protein, and mod-form distribution, for the proportions of each mod-form in a sample.
We have developed mass-spectrometry (MS) techniques in which conventional bottom-up peptide-based MS is combined with newer top-down, or whole-protein, MS to quantitatively constrain the mod-form distribution of typical cellular proteins with small numbers (<5) of modified sites. We have also developed a new mathematical framework which shows that the combinatorial explosion in mod-forms disappears mathematically at steady-state, offering a way to overcome the complexity barrier. Here, we build on this foundation by bringing together an outstanding group of collaborators, with expertise that spans cell biology, mass spectrometry, mathematics and computation, to unravel whether and how endogenous p53 in human cells uses PTMs to encode information. We have already obtained the first top-down mass-spectrum of intact p53. By focusing on such a challenging exemplar, we expect to learn a great deal about p53 itself while developing concepts and methods that can be widely applied to other cellular proteins in which PTMs play a key role.
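The two quantitative ideas in the abstract, the count of combinatorial modification states and the mod-form distribution, can be illustrated with a minimal sketch. It assumes, for illustration only, that each site is either modified or unmodified (two states per site), and it represents a mod-form as a tuple of 0/1 flags. The function names and the four-molecule sample are hypothetical and are not taken from the project's methods or data.

```python
from collections import Counter
from fractions import Fraction


def count_mod_forms(n_sites: int, states_per_site: int = 2) -> int:
    """Number of possible mod-forms, assuming every site varies independently.

    Assumption: two states per site (modified or unmodified) by default.
    """
    return states_per_site ** n_sites


def mod_form_distribution(molecules):
    """Proportion of each mod-form among the observed molecules.

    Each molecule is a tuple of 0/1 flags, one per site (hypothetical encoding).
    """
    counts = Counter(molecules)
    total = sum(counts.values())
    return {form: Fraction(n, total) for form, n in counts.items()}


# With 100 binary sites the number of mod-forms is 2**100, which exceeds 10**30.
n_states = count_mod_forms(100)

# Hypothetical sample of four molecules of a protein with three sites.
sample = [(0, 0, 0), (1, 0, 0), (1, 0, 0), (1, 1, 0)]
dist = mod_form_distribution(sample)
```

In this toy sample only three of the eight possible mod-forms occur, which mirrors the point that very few states are expected to be present in any given cellular condition.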

Public Health Relevance

Most proteins in an organism are chemically modified on a reversible basis and disruptions to such 'post-translational modification' (PTM) play an important role in diseases like cancer and Alzheimer's. It has been suggested that PTM provides some kind of 'code' that goes beyond the organism's genetic code but the extraordinary complexity of PTM has hampered efforts to understand this. In this multi-disciplinary proposal, we build upon previous experimental and theoretical advances to analyze the tumor suppressor and 'guardian of the genome' p53, which is modified at over 100 locations, and expect to learn much more about how p53 encodes information, while developing methods that can be widely applied to other proteins in which PTM plays a key role.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Modeling and Analysis of Biological Systems Study Section (MABS)
Program Officer
Dunsmore, Sarah
Harvard Medical School
Schools of Medicine
United States
Compton, Philip D; Kelleher, Neil L; Gunawardena, Jeremy (2018) Estimating the Distribution of Protein Post-Translational Modification States by Mass Spectrometry. J Proteome Res 17:2727-2734
Malleshaiah, Mohan; Padi, Megha; Rué, Pau et al. (2016) Nac1 Coordinates a Sub-network of Pluripotency Factors to Regulate Embryonic Stem Cell Differentiation. Cell Rep 14:1181-1194