Volunteer computing (VC) uses donated computing time consumer devices such as home computers and smartphones to do scientific computing. It has been shown that VC can provide greater computing power, at lower cost, than conventional approaches such as organizational computing centers and commercial clouds. BOINC is the most common software framework for VC. Essentially, donors of computing time simply have to load BOINC on their computer or smartphone, and then register to donate at the BOINC web site. VC provides "high throughput computing": handling lots of independent jobs, with performance goals based on the rate of job completion rather than completion time for individual jobs. This type of computing (all known as high-throughput computing) is in great demand in most areas of science. Until now, the adoption of VC has been limited by its structure. For example, VC projects (such as Einstein@home and Rosetta@home) are operated by individual research groups, and volunteers must browse and choose from among many such projects. As a result, there are relatively few VC projects, and volunteers are mostly tech-savvy computer enthusiasts. This project aims to solve these problems using two complementary development efforts: First, it will add BOINC-based VC conduits to two major high-performance computing providers: (a) the Texas Advanced Computing Center, a supercomputer center, and (b) nanoHUB, a web portal for nano science that provides computing capabilities.Also, a unified control interface to VC will be developed, tentatively called Science United, where donors can register. The project will benefit thousands of scientists who use these facilities, and it will create technology that makes it easy for other HPC providers to add their own VC back ends. Also, Science United will provide a simpler interface to BOINC volunteers where they will register to support scientific areas, rather than specific projects. Science United will also serve as an allocator of computing power among projects. Thus, new projects will no longer have to do their own marketing and publicity to recruit volunteers. Finally, the creation of a single VC "brand" (i.e Science United) will allow coherent marketing of VC to the public. By creating a huge pool of low-cost computing power that will benefit thousands of scientists, and increasing public awareness of and interest in science, the project plans to establish VC as a central and long-term part of the U.S. scientific cyber infrastructure.

Adding VC to an existing HPC facility involves several technical issues, which will be addressed as follows: (1) Packaging science applications (which typically run on Linux cluster nodes) to run on home computers (mostly Windows, some Mac and Linux): the team is developing an approach using VirtualBox and Docker, in which the application and its environment (Linux distribution, libraries, executables) are represented as a set of layers comprising a Docker image, which is then run as a container within a Linux virtual machine on the volunteer device. This has numerous advantages: it reduces the work of packaging applications to near zero; it minimizes network traffic because a given Docker layer is downloaded to a host only once; and it provides a strong security sandbox so that volunteer computers are protected from buggy or malicious applications, (2) File management: Input and output files must be moved between existing private servers and public-facing servers that are accessible to the outside Internet. A file management system will be developed, based on Web RPCs, for this purpose. This system will use content-based naming so that a given file is transferred and stored only once. It also maintains job/file associations so that files can be automatically deleted from the public server when they are no longer needed. (3) Submitting and monitoring jobs: BOINC provides a web interface for efficiently submitting and monitoring large batches of jobs. These were originally developed as part of a system to migrate HTCondor jobs to BOINC. This project is extending it to support the additional requirements of TACC and nanoHUB. Note that these new capabilities are not specific to TACC or nanoHUB: they provide the glue needed to easily add BOINC-based VC to any existing HTC facility. The team is also developing RPC bindings in several languages (Python, C++, PHP). The other component of the project, Science United, is a database-driven web site and an associated web service for the BOINC clients. Science United will control volunteer hosts (i.e. tell them which projects to work for) using BOINC's "Account Manager" mechanism, in which the BOINC client on each host periodically contacts Science United and is told what projects to run. Project servers, not Science United, will distribute jobs and files. Science United will define a set of "keywords" for science areas (physics, biomedicine, environment, etc.) and for location (country, institution). Projects will be labelled with appropriate keywords. Volunteers will have a yes/no/maybe interface for specifying the types of jobs they want to run. Science United will thus provide a mechanism in which a fraction of total computing capacity can be allocated to a project for a given period. Because total capacity changes slowly over time, this allows near-certain guaranteed allocations. Science United will embody a scheduling system that attempts to enforce allocations, honor volunteer preferences, and maximize throughput. Finally, Science United will do detailed accounting of computing. Volunteer hosts will tell Science United how much work (measured by CPU time and FLOPs, GPU time and FLOPs, and number of jobs) they have done for each project. Science United will maintain historical records of this data for volunteers and projects, and current totals with finer granularity (e.g. for each host/project combination). Finally, Science United will provide web interfaces letting volunteers see their contribution status and history, and letting administrators add projects, control allocations, and view accounting data.

Agency
National Science Foundation (NSF)
Institute
Division of Advanced CyberInfrastructure (ACI)
Type
Standard Grant (Standard)
Application #
1664022
Program Officer
Stefan Robila
Project Start
Project End
Budget Start
2017-05-15
Budget End
2020-07-31
Support Year
Fiscal Year
2016
Total Cost
$500,000
Indirect Cost
Name
University of Texas Austin
Department
Type
DUNS #
City
Austin
State
TX
Country
United States
Zip Code
78759