This research project is concerned with developing efficient Monte Carlo algorithms for rare event simulation and the associated large deviations theory. We consider two broad classes of problems in which rare events are either of direct interest or a determining factor in the performance of the Monte Carlo scheme. For the first class of problems, importance sampling and particle branching methods have proven to be powerful tools. A unifying approach to the design of these schemes is to exploit an important connection between well-designed schemes and subsolutions of an associated Hamilton-Jacobi-Bellman equation. This approach has been applied successfully in settings with piecewise homogeneous dynamics, such as queueing networks. The research project aims to treat more complicated models, such as small noise diffusions with rapidly oscillating components, where the dynamics are fully nonlinear. With regard to the second class of problems, a particularly important topic is the approximation of the invariant distribution of a system with multiple metastable states by the occupation measure of a related Markov process. Moving from one metastable state to another is a rare event, and its treatment is the key question in the design of efficient Monte Carlo schemes. Many ad hoc algorithms are available; however, they do not always work well and must be applied with care. The research project will rigorously analyze some existing algorithms as well as design new ones with better performance. Two classes of fast simulation schemes of particular interest for this problem are parallel tempering and importance sampling.
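
As an illustration of the basic importance sampling idea (a minimal sketch only, not taken from the project itself), the following Python fragment estimates the small probability P(X > a) for a standard Gaussian X by sampling from an exponentially tilted distribution and reweighting by the likelihood ratio. For this toy problem the tilt N(a, 1) is the one suggested by the large deviations rate function; in the problems studied in the project, the change of measure would instead be read off from a subsolution of the associated Hamilton-Jacobi-Bellman equation.

    import numpy as np

    # Toy rare-event problem: p = P(X > a) for X ~ N(0, 1) with a = 4,
    # so p is roughly 3.2e-5 and naive Monte Carlo needs a huge sample.
    rng = np.random.default_rng(0)
    a, n = 4.0, 10_000

    # Naive Monte Carlo: almost no samples hit the event at this sample size.
    x = rng.standard_normal(n)
    naive = np.mean(x > a)

    # Importance sampling: draw Y ~ N(a, 1) and reweight each sample by the
    # likelihood ratio dP/dQ(y) = exp(-a*y + a**2/2).
    y = rng.normal(loc=a, scale=1.0, size=n)
    w = np.exp(-a * y + 0.5 * a**2) * (y > a)
    estimate = w.mean()
    rel_err = w.std(ddof=1) / np.sqrt(n) / estimate

    print(f"naive estimate:               {naive:.3e}")
    print(f"importance sampling estimate: {estimate:.3e} (rel. error ~ {rel_err:.2%})")

The tilted sampler places most of its samples on the rare event, so the reweighted estimator attains a small relative error at a sample size for which the naive estimator typically returns zero.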

In many branches of science, such as biology, chemistry, physics, and engineering, the study of rare events (events with very little chance of happening) is often of central interest. For example, in the study of proteins and other biomolecules, physics-based models are employed to study the interaction of atoms or molecules. Due to the complexity of these models, analytical calculation is impossible, and the primary tool of analysis is simulation, which provides valuable insight into the dynamic evolution of the system. However, a simulation can take an exceedingly long time before the system moves from one configuration to another (a rare event). In the context of highly reliable and secure systems, the rare event is something to be avoided, and accurate assessment of its likelihood is a key tool for design. To accelerate simulations in situations where rare events are important, many ad hoc algorithms have been proposed. For the many schemes that lack a firm theoretical foundation, key design quantities are usually selected only on the basis of prior experience. As a consequence, these schemes may work in specialized situations, but can also perform quite poorly in general. This research project has two goals. The first is the rigorous analysis of existing schemes, which can be very useful in understanding their power as well as their limitations. The second is to develop, based on the analysis of existing approaches, new schemes whose performance is provably better. This work will be of use not only to theoreticians interested in rare event simulation, but also to the large community of practitioners and scientists who use simulation as a basic tool in their research.

Project Report

Rare events, such as transitions between stable wells in a model from chemical physics, data loss in a highly reliable communications system, or unexpectedly large payouts in insurance claims, are often key quantitative measures of a system's overall behavior. The main goal of this project was the development of reliable numerical methods to estimate probabilities and expected values that are largely determined by rare events. A broadly useful approach to these problems is the Monte Carlo method, which approximates such quantities by simulating instances of the event and then defining mathematically correct estimators based on these random simulations. There are, however, many ways this can be done, and while it is possible to design more efficient schemes (also called accelerated schemes), it is also possible to devise reasonable-looking schemes that are in fact very poor. For these reasons, a proper mathematical foundation for and analysis of the methods is essential. The common theme of the research in this project was to use ideas from the theory of large deviations (a very active part of the modern theory of probability) for the design and analysis of accelerated Monte Carlo methods. In addition, the underlying large deviation theory itself was developed where needed.

Two broad classes of problems were considered. The first consists of problems where some probability or expected value requires numerical approximation (e.g., the failure probability of a communication system), and this value is largely determined by the distributional properties of just one or a few rare events. The second consists of problems where one is not particularly interested in the rare events themselves, but rather in the adverse impact rare events have on the performance of a proposed Monte Carlo scheme. The problem of interest here is the approximation of an invariant probability distribution by empirical measures. Problems of this kind arise in many important applications, ranging from numerically testing the properties of complex materials before attempting to build them in a lab to Bayesian methods for the analysis of large scale data. The goal here was to develop large deviation theory for empirical measures as a tool for the analysis and design of existing schemes, and also to generate new approximation methods.

With regard to the first class, the main goal was to extend previous work of the PIs to more complex models of the sort found in biology and chemistry, and in particular to complicated multiscale process dynamics (i.e., processes whose important statistical relations occur at widely different length scales). Provably sound accelerated methods were developed and implemented. As a precursor, the underlying large deviation theory needed to understand the simulation problem was developed.

With regard to the second class, the goal was to take a quantity known as the "large deviation rate function," which in some sense directly measures the probability of large errors in the numerical approximation, and use it as a "rate of convergence" for the design of good computational schemes. Starting with a method known as parallel tempering, which is the most widely used general purpose method available for large scale problems (a simplified sketch of its basic swap mechanism appears below), we developed a theoretically and practically superior algorithm that we call "infinite swapping," as well as nearly optimal implementations of it, also superior in both theory and practice, that we call "partial infinite swapping." A module that allows other researchers to experiment with the infinite swapping and partial infinite swapping algorithms was added to a widely used web site.
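
The following Python fragment is a minimal sketch (for illustration only; it is not the project's implementation or the web-site module mentioned above) of parallel tempering for a one-dimensional double-well potential V(x) = (x^2 - 1)^2: two Metropolis chains run at a cold and a hot temperature, and occasional temperature swaps let the hot chain carry the cold chain across the energy barrier between the wells. Infinite swapping can be thought of as the limit in which such swaps are attempted infinitely often.

    import numpy as np

    rng = np.random.default_rng(1)

    def V(x):
        # Double-well potential with minima at x = -1 and x = +1.
        return (x**2 - 1.0) ** 2

    T = np.array([0.1, 1.0])       # cold and hot temperatures
    x = np.array([-1.0, -1.0])     # both chains start in the left well
    n_steps, step, swap_every = 50_000, 0.5, 10
    cold = np.empty(n_steps)

    for k in range(n_steps):
        # Metropolis random-walk step within each temperature.
        prop = x + step * rng.standard_normal(2)
        accept = np.log(rng.random(2)) < -(V(prop) - V(x)) / T
        x = np.where(accept, prop, x)

        # Occasional swap attempt between the two temperatures.
        if k % swap_every == 0:
            log_alpha = (1.0 / T[0] - 1.0 / T[1]) * (V(x[0]) - V(x[1]))
            if np.log(rng.random()) < log_alpha:
                x = x[::-1]

        cold[k] = x[0]

    # Without swaps the cold chain would almost never leave the left well;
    # with them its occupation measure covers both wells.
    print("fraction of cold-chain samples in the right well:", np.mean(cold > 0))

The empirical (occupation) measure of the cold chain is the approximation to the invariant distribution whose large deviation properties the project used as a measure of convergence; swap moves are exactly the mechanism whose frequency is pushed to infinity in the infinite swapping limit.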

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1008331
Program Officer: Henry Warchall
Budget Start: 2010-09-01
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $280,000
Name: Brown University
City: Providence
State: RI
Country: United States
Zip Code: 02912