The ongoing effort to move data-intensive computation to low-cost public clouds has been impeded by privacy concerns, as today's cloud providers offer little assurance for the protection of sensitive user data. This problem cannot be addressed by existing cryptographic techniques alone, which are often too heavyweight for computation over large amounts of data. As a result, many computing tasks have to be run on individual organizations' internal systems whenever they touch even a very small amount of sensitive information.
The research in this project seeks practical solutions to this critical security challenge. The PIs are working on an approach to split a computing job over a hybrid-cloud platform, delegating to a public cloud the computation over public data, while keeping the computation on sensitive data within a private cloud.
Specifically, the PIs are developing a privacy-aware MapReduce system, which transparently partitions a computing job and schedules its components across the public and private clouds according to the security levels of the data involved. The system is designed to achieve high security assurance and to outsource as much of its workload as possible, with small computational and communication overhead. It includes support for analyzing and transforming the code of legacy jobs as well as for developing new jobs. The PIs are also working to extend these techniques to facilitate other secure workflow processing over hybrid clouds. This research involves industry collaborators and contributes to secure processing of a wide range of computing jobs, from commercial data analysis, to DNA analysis, to intrusion detection.
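
The partitioning idea can be illustrated with a minimal sketch. It is not the project's system: the `Record` type, its boolean `sensitive` label, and the functions `partition`, `word_count`, and `run_hybrid` are hypothetical names chosen for illustration, and both "clouds" are simulated in a single process. The sketch assumes every input record already carries a security label, runs a word-count job separately on the public and sensitive partitions, and merges the partial results on the private side.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    sensitive: bool  # hypothetical security label; assumed to be supplied with the data


def partition(records):
    """Split the input by security label into (public, private) lists."""
    public = [r for r in records if not r.sensitive]
    private = [r for r in records if r.sensitive]
    return public, private


def word_count(records):
    """Map each record to its words and reduce them into per-word counts."""
    counts = Counter()
    for r in records:
        counts.update(r.text.lower().split())
    return counts


def run_hybrid(records):
    """Run the job on each partition and merge the partial results.

    Returns the merged counts and the fraction of records outsourced.
    """
    public, private = partition(records)
    public_counts = word_count(public)    # stands in for work done on the public cloud
    private_counts = word_count(private)  # stands in for work kept on the private cloud
    # The merge is placed on the private side so results derived from
    # sensitive records are never sent to the public cloud.
    merged = public_counts + private_counts
    outsourced = len(public) / len(records) if records else 0.0
    return merged, outsourced


if __name__ == "__main__":
    job_input = [
        Record("the cat", sensitive=False),
        Record("the patient", sensitive=True),
        Record("a cat", sensitive=False),
    ]
    counts, outsourced = run_hybrid(job_input)
    print(dict(counts), outsourced)
```

In this toy setting the outsourced fraction is simply the share of records labeled public; a real scheduler would also have to account for data movement between the two clouds, which the sketch ignores.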