In this project, the PI is developing theoretical foundations for performing iterative computations on massive data in a distributed environment. Based on the developed theory, the PI aims to build highly scalable and efficient distributed frameworks for iterative computations. These frameworks take the burden of describing the iterative process away from programmers and perform the iterative updates efficiently. A series of programming models will be developed to challenge the conventional wisdom that synchronization is essential and that iterative computations must be performed in an "iteration by iteration" manner. The goal of the proposed programming models and their supporting distributed frameworks is to relieve programmers of the burden of specifying the execution order of iterative updates and the communication mechanisms, and to automatically optimize the execution of the computation on a cluster of machines.
The technologies developed in this project will have immediate and important applications with broad societal impact, such as road traffic prediction, biological information discovery, online marketing, and computer forensic analysis.