Understanding cooperative processes in biomolecular systems (such as protein dynamics, folding, and self-assembly) poses outstanding challenges for both theory and experiment. The large number of degrees of freedom involved, and the degree of heterogeneity present, may at first glance suggest a lack of general principles.
As Francis Crick wrote twenty years ago, commenting on macromolecular dynamics, "what seems to physicists a hopelessly complicated process may have been what Nature found simplest". The behavior of a macromolecular system appears overwhelmingly complicated. Biologically relevant macromolecular systems are the result of billions of years of evolution, during which details and exceptions have been selected for functional reasons. In spite of this complexity, collective phenomena emerge in macromolecular systems, for instance in protein folding and self-assembly processes, suggesting the existence of organizing principles that may actually exploit the complexity to obtain simplicity.
Is it possible to understand (that is, reproduce, quantify, and predict) how organization emerges from the interactions of the single degrees of freedom in a biomolecular system, over a broad spectrum of length scales and timescales? Empirical and theoretical evidence supports the idea that most macromolecular processes visit only a small portion of the conformational space, and that on medium to long timescales a very small number of parameters is enough to describe the coarse dynamics of a large macromolecular system. Previous work in this direction has been neither automatic nor systematic; it has been driven mostly by physical intuition, with little or no guarantee of success.
The goal of this work is to develop and apply a radically different approach that reconciles biological and biochemical approaches with a physical and mathematical perspective. A key step towards formulating the general "rules" that evolution has employed to regulate the behavior of biomolecular processes is the mathematically rigorous identification, and the physically sound interpretation, of the minimal set of effective parameters needed to faithfully reproduce the macromolecular process of interest. The methods that will be developed to this end are based on multiscale geometric measure theory, harmonic analysis, and dimensionality reduction. The core of these ideas will be widely applicable to the analysis of the geometry of large high-dimensional data sets, across and beyond biophysics, leading to novel general paradigms for dimensionality reduction and regression on such data sets.
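To make the notion of a "minimal set of effective parameters" concrete, the following is a minimal sketch, not the multiscale method proposed here. It swaps in plain principal component analysis (a linear technique) and applies it to synthetic data: a hypothetical 60-coordinate "conformation" that is secretly driven by 2 latent variables plus small noise. The function names, the 95% variance threshold, and all data sizes are assumptions chosen for illustration only.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of the total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                      # center each coordinate
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values only
    var = s ** 2
    return var / var.sum()


def effective_dimension(X, threshold=0.95):
    """Smallest number of components whose cumulative variance reaches threshold."""
    cumulative = np.cumsum(explained_variance_ratio(X))
    return int(np.searchsorted(cumulative, threshold) + 1)


# Synthetic stand-in for a trajectory (an assumption, not real molecular data):
# 500 snapshots, 60 observed coordinates, 2 hidden collective variables.
rng = np.random.default_rng(0)
n_samples, n_coords, n_latent = 500, 60, 2
latent = rng.normal(size=(n_samples, n_latent))
embedding = rng.normal(size=(n_latent, n_coords))   # random linear embedding
noise = 0.01 * rng.normal(size=(n_samples, n_coords))
X = latent @ embedding + noise

ratio = explained_variance_ratio(X)
dim = effective_dimension(X)
print("estimated effective dimension:", dim)
print("variance in leading components:", ratio[:4].round(4))
```

In this toy setting the estimate should recover the number of latent variables, because the embedding is linear and the noise is small. Real conformational data are generally nonlinear and heterogeneous across scales, which is the setting the multiscale geometric methods described above are meant to address.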