Serverless functions, or Functions as a Service (FaaS), are a cloud computing feature whose popularity has been increasing in recent years. This project will improve serverless functions with a sophisticated runtime system that will allow users to run code efficiently while keeping serverless functions economically viable to providers. While keeping the programming model simple, a more sophisticated runtime will provide features such as efficient caching of intermediate results and fault tolerance. Meanwhile hardware acceleration (e.g., graphical processing units (GPUs)) will be transparently enabled. As a consequence, serverless functions will be made efficient for new classes of workloads such as video processing and machine learning inference.
Achieving efficient execution with a simple programming model requires a technically sophisticated runtime system. Organizing the computation as a data flow graph allows the user to provide only simple data dependencies while the runtime simultaneously schedules local storage and computational accelerators along with more traditional resources such as the Central Processing Unit (CPU) cores and memory. Serverless workloads require high parallelism and short run times to make the platform worthwhile. However, maintaining high levels of parallelism can be difficult because of input-dependent processing requirements and GPU acceleration. Load imbalance arises when the stages specified in a data flow graph have data-dependent processing requirements. This is common in some machine learning (ML) related tasks, e.g., face recognition. GPUs may make the problem worse because a data flow graph that is balanced for CPU execution might become unbalanced when some stages are executed on a GPU where execution is much faster.
This project will provide the necessary tools, techniques, and infrastructure to bring serverless functions to new workloads with unprecedented levels of performance. This allows the continued exponential evolution and innovation for systems that rely on machine learning and other compute-intensive computations. This project will also provide an opportunity for doctoral students to work as graduate research assistants while gaining broad exposure to interdisciplinary research that draws from multiple areas of computer science, including operating systems, virtualization and GPUs.
Results from this project will be made public where they can be archived. All published material from the project will be distributed for free from the authors' web site. Research artifacts are likely to include modified source code and workloads. Research publications will be available at www.cs.utexas.edu/users/witchel/. Source code, workloads, and other artifacts will be available at https://github.com/ut-osa/.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.