Machine learning systems are operating at increasing scale and benefit nearly all areas of human activity, from improved voice recognition, search and advertising, and automatic language translation to, on the horizon, self-driving cars. The "inference algorithms" that enable these systems are already extremely hard to implement on large, scalable clusters of computers, and future data center trends will exacerbate this difficulty: increasingly large numbers of nodes; heterogeneous clusters that mix conventional microprocessors, graphics processors, and larger numbers of small, power-efficient microprocessors; and hardware changes such as the introduction of flash-based solid-state disks. The goal of this proposal is to design, analyse, and implement novel inference algorithms that not only take advantage of these trends for high performance, but also enable future, even-larger-scale systems to be implemented.

The proposal specifically aims to achieve the following:

1. Develop a broad family of novel optimization algorithms for machine learning;
2. Analyse their convergence properties theoretically, as well as empirically;
3. Release open-source code implementing them.

The research proposed is based upon four likely shifts in the design of data centers of the future:

1. Small, power-efficient microprocessors with a much improved ratio of CPU power to energy consumption will become common in the data centers of the future.
2. Architectures mixing different types of hardware, ranging from computer graphics processors to general-purpose multi-core microprocessors, are becoming the norm among all major semiconductor manufacturers. These changes will propagate to the data center.
3. Hard disks are increasingly being supplemented and replaced by solid-state memory, which requires 10,000 to 100,000 times less time to access.
4. Modern network architectures that replace traditional hierarchical tree structures (with inherent bottlenecks) with more balanced layouts are being enabled by software-defined networking and specialized network chips.

All four of these shifts offer considerable potential for designing faster machine learning algorithms. Realizing that potential requires tightly coupled algorithmic and systems design: algorithms that work well on the kinds of systems that can be built, and systems that provide the right support for machine learning algorithms.

The software developed for this project will be distributed as open source.

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Program Officer
Aidong Zhang
Carnegie-Mellon University
United States