The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase I project will be the creation of a new tool that could prevent the loss of sensitive data stored in big data management systems to cyberattacks. The proposed tool can also allow organizations to audit big data usage to prevent abuse and misuse of the stored data. Such a tool may increase trust in big data management systems and protect the sensitive data they store against both outsider and insider attacks. The company believes that the tool would address an important customer need and could have significant commercial impact as more companies adopt big data management technologies such as Hadoop and Spark. The company plans to pursue a freemium business model and to open-source some of the developed code, which may in turn improve the data protection capabilities of the freely available open-source tools used by many companies and organizations.
This Small Business Innovation Research (SBIR) Phase I project will prove the feasibility of a novel tool for managing big data privacy, security, and governance. The tool will provide enhanced security and privacy protection in a single package, without the need to modify the underlying big data management system: it will enforce privacy policies using on-the-fly data masking, security policies using role-based access control, and governance policies using data encryption together with advanced auditing and accountability features. To develop the proposed prototype, the company will address technical challenges such as developing efficient privacy-preserving policy enforcement solutions with very little overhead and designing an interactive user interface that makes governance and privacy policies easy to specify. To address these challenges, the company proposes to leverage recent advances in aspect-oriented programming to inject code seamlessly into submitted data analysis jobs, enabling transparent data encryption, data sanitization, and enforcement of accountability, compliance, and governance policies. Using this injected code, data that is stored in encrypted format could be decrypted and sanitized as needed before it is used for data analysis, and the necessary logs could be generated for accountability purposes.
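The code-injection idea can be illustrated with a minimal sketch, which is not the project's implementation. It uses a Python decorator as a stand-in for aspect-oriented weaving: the analysis job itself is left unmodified, while the wrapper decodes stored records, masks fields according to the caller's role, and writes an audit entry before the job runs. All names here (POLICY, AUDIT_LOG, decode_record, mask_record, enforce_policy, count_by_zip) are hypothetical, and base64 decoding is only a placeholder for real decryption.

```python
import base64
import functools
import json

# Hypothetical policy: the fields each role may see unmasked.
POLICY = {
    "analyst": {"zip"},
    "auditor": {"zip", "ssn"},
}

# In-memory stand-in for a persistent audit trail.
AUDIT_LOG = []


def decode_record(blob):
    """Placeholder for decryption. base64 is NOT encryption."""
    return json.loads(base64.b64decode(blob))


def mask_record(record, role):
    """Replace every field the role is not allowed to see with '***'."""
    allowed = POLICY.get(role, set())
    return {k: (v if k in allowed else "***") for k, v in record.items()}


def enforce_policy(job):
    """Aspect-style wrapper: runs policy 'advice' around an unmodified job."""
    @functools.wraps(job)
    def wrapper(blobs, *, user, role):
        # Decode and sanitize the stored data before the job sees it.
        records = [mask_record(decode_record(b), role) for b in blobs]
        # Record who ran which job over how many records.
        AUDIT_LOG.append(
            {"user": user, "role": role, "job": job.__name__, "records": len(records)}
        )
        return job(records)
    return wrapper


@enforce_policy
def count_by_zip(records):
    """An ordinary analysis job, written with no knowledge of the policy layer."""
    counts = {}
    for r in records:
        counts[r["zip"]] = counts.get(r["zip"], 0) + 1
    return counts


# Sample "stored" data, encoded with the placeholder scheme.
stored = [
    base64.b64encode(json.dumps(r).encode())
    for r in (
        {"ssn": "123-45-6789", "zip": "75080"},
        {"ssn": "987-65-4321", "zip": "75080"},
    )
]

print(count_by_zip(stored, user="alice", role="analyst"))  # {'75080': 2}
```

In this sketch the analyst's job can still aggregate by ZIP code, but any attempt to read the ssn field would see only the masked value. A tool for Hadoop or Spark would more plausibly weave such advice into JVM jobs at submission time. The decorator only shows the separation between job logic and policy enforcement.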