The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase I project is a unified backend that enables seamless, scalable visual data access for Machine Learning (ML) deployments. It removes the need to assemble a fragmented system from numerous independent products in what should be a single pipeline. Improvements in ML have made it possible for businesses to extract insights from information-rich visual data such as images and video. Handling big-visual-data for ML and query purposes requires storage and access methods designed for visual ML. With current off-the-shelf alternatives, ML engineers and data scientists are forced to combine data solutions that were not built for visual data management. Businesses pay the resulting technical debt in the form of a) extra data platform resources, b) talent mismatch when ML engineers and data scientists are forced to engineer infrastructure, and c) delayed product launches. This project creates a unified system that provides one solution across the stages of ML, from data collection and curation to training, inference, and business queries.
This Small Business Innovation Research (SBIR) Phase I project will lead to a novel data management platform designed for large-scale visual data, with an interface specialized for Machine Learning (ML) and Expert Insights queries. The project aims to build infrastructure to support thousands of concurrent clients, trillions of metadata entities, and petabytes of visual data, a scale expected to become common in domains with increasing use of visual data. The platform is designed to scale without degrading performance, particularly for the emerging area of visual data management for ML deployments. The work will include visual data storage for ingesting data from a large number of IoT-like devices and an ML-aware application programming interface for low-latency, high-throughput access to big-visual-data. The scalable metadata database is designed to exploit new memory technology, along with novel caching and tiering methods that use content-based knowledge of image/video data via novel formats.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.