Proteins must fold into specific three-dimensional shapes to perform the tasks required by the cell. When proteins fold into the wrong shape, they can disrupt the normal functioning of the cell; misfolded proteins are associated with a wide range of maladies, including Alzheimer's and Parkinson's diseases. How is it that proteins ‘know’ how to fold into the correct structures (at least, most of the time)? Simple proteins typically have the property that, after being unfolded in a test tube, they can spontaneously refold into their native structures. This implies that their native states are also their most stable forms, and therefore that thermodynamics naturally favors their formation. However, most studies of protein folding to date focus on a small number of simple model systems and leave out the fact that cells have thousands of different proteins, many of which are bigger and more complicated than the ones that are typically amenable to biophysical characterization. The purpose of this project is to apply modern mass spectrometry (MS) proteomics methods, which can routinely characterize thousands of proteins in complex mixtures, to problems in protein folding. These tools will enable the exploration of many classes of proteins whose folding has never been interrogated. Moreover, since the majority of the concepts and theories about protein folding have been based on a heretofore “privileged” subset of model proteins, the expectation is that this research will “leave the fold” and shift current paradigms on protein folding. The project will also provide opportunities to engage individuals who are underrepresented in the scientific community and workforce through two activities. The first targets first-generation undergraduates and integrates closely with the research plan. The second is designed to meet a pressing need in the Baltimore area to introduce more STEM career development resources to low-income communities.
The project will also develop a web database to make the proteomic data it generates more accessible to the biophysical community. Over time, this database could become an authoritative resource on protein folding, because proteomics experiments can quantify folding for large numbers of proteins under uniform conditions.

Specifically, this project utilizes and expands two emerging approaches in structural proteomics and seeks to address two far-reaching questions about protein folding. Methodologically, the project employs limited proteolysis and crosslinking to encode structural information about proteins (as well as their folding intermediates and their misfolded forms following failed attempts at refolding) into cleavage sites and crosslinks, which can be sequenced en masse by mass spectrometry. First, the project will use these methods to explore the limits of thermodynamic refolding. In particular, proteins from thermophilic organisms and ancient proteins might be expected to be more refoldable than their mesophilic and extant peers. Moreover, chaperones are expected to help rescue the refolding of proteins that cannot refold on their own. Experiments conducted as part of this project will explicitly test these hypotheses by using limited proteolysis mass spectrometry to examine the refoldability of a wide range of proteins, both on their own and with the assistance of several chaperone systems. Second, this research aims to uncover the biophysical basis of non-refoldability. When a protein fails to refold, what structure does it assume, and what intermediates lead it down that ill-fated path? Experiments will address these questions by using crosslinking mass spectrometry to probe the structural dynamics of refolding proteomes. The results will provide new insight into the topologies of protein free energy landscapes and enable the characterization of kinetically trapped misfolded species.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

National Science Foundation (NSF)
Division of Molecular and Cellular Biosciences (MCB)
Program Officer: Wilson Francisco
Johns Hopkins University
United States