This project asks: how can one build a cloud storage service under minimal trust assumptions? The project aims to design, implement, and evaluate a practical, concrete system that allows clients to use cloud storage providers like Azure and S3 -- but without the clients having to trust (that is, assume) that the providers always operate correctly.
Of course, reducing assumptions is generically good, but doing so is particularly relevant to cloud storage: for economic and operational reasons, data is increasingly migrating to storage service providers (SSPs), yet SSPs are complex black boxes that can experience software bugs, correlated manufacturing defects, misconfigured servers, operator error, malicious insiders, bankruptcy, fires, and more.
For these reasons, the project is building a system, called Padova, that tolerates scenarios in which all servers in the SSP fail, where the failures may include malicious, buggy, or improbably unlucky behaviors. The approach is to enforce a sensible ordering of updates at the client; this provides the foundation for safety and allows a client to gather updates from any node in the system, client or server, which in turn contributes to liveness and availability. Padova eliminates trust for safety and minimizes trust for liveness and availability.
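The idea of enforcing update ordering at the client can be illustrated with a minimal Python sketch. This is not Padova's actual protocol: the `Update` and `Client` names, the per-writer sequence numbers, the hash chaining, and the use of HMAC shared secrets (a stand-in for real digital signatures) are all assumptions made for illustration. The sketch only shows why a client that checks ordering itself can accept updates relayed by any node, trusted or not.

```python
import hashlib
import hmac
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder digest preceding a writer's first update


@dataclass(frozen=True)
class Update:
    writer: str    # hypothetical writer identifier
    seq: int       # per-writer sequence number, starting at 1
    prev: str      # digest of this writer's previous update
    key: str
    value: str
    tag: str = ""  # HMAC over the fields above (stand-in for a signature)

    def payload(self) -> bytes:
        return repr((self.writer, self.seq, self.prev, self.key, self.value)).encode()

    def digest(self) -> str:
        return hashlib.sha256(self.payload()).hexdigest()


def sign(update: Update, secret: bytes) -> Update:
    """Return a copy of `update` carrying an HMAC tag over its payload."""
    tag = hmac.new(secret, update.payload(), hashlib.sha256).hexdigest()
    return Update(update.writer, update.seq, update.prev, update.key, update.value, tag)


class Client:
    def __init__(self, secrets: dict):
        self.secrets = secrets  # writer -> shared secret (assumption)
        self.last = {}          # writer -> (seq, digest) of the last accepted update
        self.store = {}         # key -> latest accepted value

    def accept(self, u: Update) -> bool:
        """Accept `u` only if it is authentic and extends its writer's history."""
        secret = self.secrets.get(u.writer)
        if secret is None:
            return False
        expected = hmac.new(secret, u.payload(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected.encode(), u.tag.encode()):
            return False  # forged or corrupted in transit
        seq, digest = self.last.get(u.writer, (0, GENESIS))
        if u.seq != seq + 1 or u.prev != digest:
            return False  # reordered, replayed, or missing a predecessor
        self.last[u.writer] = (u.seq, u.digest())
        self.store[u.key] = u.value
        return True


secrets = {"alice": b"demo-secret"}
client = Client(secrets)
u1 = sign(Update("alice", 1, GENESIS, "photo", "v1"), secrets["alice"])
u2 = sign(Update("alice", 2, u1.digest(), "photo", "v2"), secrets["alice"])
print(client.accept(u2), client.accept(u1), client.accept(u2))
```

In this sketch the node that delivers an update never needs to be trusted, because every check runs at the receiving client. A fuller design would also buffer out-of-order updates and handle a writer whose history has been forked; both are omitted here.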
This project makes cloud storage safer for existing customers and spurs further adoption of cloud storage. This means more people paying less for computing, producing beneficial effects throughout the computational ecosystem. The educational benefits include graduate and undergraduate mentoring, the latter through UT's Turing Scholars program, which cultivates particularly talented undergraduates.
MOTIVATION AND INTELLECTUAL PROGRESS

This project is motivated by the rise of outsourced, or cloud, computing: companies and individuals now store their data on banks of machines controlled by someone else. Even as this arrangement brings economic benefits, it also raises problems. The providers of the outsourced service are complex, making it unlikely that they always operate correctly; moreover, they are black boxes, making it difficult to detect whether a problem has occurred. (The problems include corruption of data, misconfiguration, operator error, malice, hardware errors, etc.)

Given this context, the goal of the project is to remove trust from cloud systems, with a primary focus on storage systems. The ideal (which is not always achievable) is to provide guarantees to cloud customers, and allow them to get useful work out of the cloud service, _even if the service malfunctions in arbitrary ways_.

Specific outcomes of the project include the following:

* The Depot system. Depot is an implemented software package that allows developers, and implicitly their customers (that is, all of us), to use cloud storage services without having to worry that their data (photos, calendars, etc.) might be lost, stale, corrupted, or compromised. Of course, many things can go wrong with cloud storage, and Depot's approach is not to enumerate all of them but to rely on verifications at the _client_ of the service, which heads off a range of malfunctions. Depot provides its guarantees at minor computational cost.

* The Salus system. Despite its encouraging results, Depot is limited to modestly sized data volumes (megabytes or gigabytes per customer). The Salus thrust refines Depot's techniques to reduce their costs. This allows the system to scale to much larger data volumes: in principle, petabytes or exabytes. As part of Salus, the project has developed an emulator that allows researchers to experiment with large-scale storage systems while using 100x fewer machines than would be required to actually run the system.

* The Pantry system. Depot and Salus give clients assurances about the correctness of the storage layer (for example, clients gain assurance that what they read corresponds to what the other clients wrote). But what if, in addition, the cloud itself is processing or computing on the stored data? Can we give assurances to the client _both_ that storage is operating correctly _and_ that the computation is executing correctly? The Pantry thrust shows that such assurances are possible. Specifically, it was known in theory how to provide such guarantees, but the costs were far too high for practice. Pantry has dramatically reduced these costs in the context of an implemented system. While the system is not yet practical, it suggests that these techniques could be practical in the future.

IMPACT, DISSEMINATION, AND EDUCATION

The project's results, if "productized", could allow customers to use cloud services for storage without having to trust (that is, assume) that the cloud works correctly. The results of the project have been disseminated in competitive, peer-reviewed venues.

The project has produced educational benefits at a number of levels: graduate, undergraduate, and secondary school. First, the project has provided crucial research training to eight graduate students at UT Austin; these students have gained experience in all aspects of research, and four of them have filed, or are on the cusp of filing, their PhD dissertations. Second, one of the PIs routinely involves undergraduates in research projects; indeed, some of the publications that have come out of this project have undergraduate co-authors. Last, one of the PIs has taught a short lecture series on the principles of distributed computing at the magnet high school in Austin.