This Small Business Innovation Research (SBIR) Phase I project will develop novel technologies and a software system that add fast, easy-to-use graph analytics to relational database management systems (RDBMS). Querying and analyzing massive graph data (graph analytics) can enrich business, social, and government intelligence and can significantly advance scientific and technological discovery. However, most existing graph data (business, financial, governmental, and social networking data) are stored in traditional RDBMS, which provide little fundamental support for graph analytics. This is because graph analytics requires extensive recursive and traversal operations, which are hard to express and prohibitively expensive to execute in an RDBMS. To address this fundamental challenge, this project will develop a powerful graph engine and transparently connect the RDBMS to it. The anticipated results include novel techniques for building a graph processing engine and connectors between the RDBMS and the graph engine, as well as a beta-version software system that will be tested and used by two or three early adopters/customers.

The broader impact/commercial potential of this project will be to help deliver better insights into the complex nature of relationships between people, their decisions, and their interactions with others in all areas of business and intelligence, such as online social networking, marketing and advertisement, banking and finance, logistics and transportation, telecommunications, healthcare and hospitals, homeland security, and bio-engineering and drug discovery. The technologies and software developed in this project can also directly contribute to the five major analytics segments (end-user query and reporting analysis, data warehouse management, financial performance and strategy management applications, custom relationship analytics applications, and data warehouse generation), each with global revenue of more than one billion dollars.

Project Report

Intellectual Merits: This Small Business Innovation Research Phase I project has developed innovative technologies and a software system that provide ultrafast and powerful graph analytics of Big Data for enterprise users. Graph analytics is the mining of information from a "graph", a network of linked data nodes, such as a social network. In particular, this project significantly extends the capability of traditional Relational Database Management Systems (RDBMS), which currently must store graph data in tabular format and so fail to take advantage of the network's link structure. As a result, traditional database systems cannot analyze graphs efficiently. The key outcome of the project is a graph analytics software platform consisting of a Graph Processing Engine (GPE) and a Graph Storage Engine (GSE). The current system runs on multi-core machines, exploits their parallelism, and can handle massive graphs with hundreds of millions of nodes and billions of links. In addition, this project has developed a toolkit of key graph analytics functions, such as measuring node importance, finding the shortest path from A to B, determining whether any such path exists, and recommending new links. Finally, we have developed a GraphSQL language and software connectors to traditional database systems. The GraphSQL language can specify graph views over relational tables and blend graph analytics with traditional SQL. The RDBMS connectors serve as the bridge linking the relational DBMS and GraphSQL’s graph analytics platform.

Broader Impacts: The technology developed in the project significantly advances the current capability for processing massive graph data, a major challenge for many industries, scientific research, and government agencies. In particular, the project has helped to lay the foundation for a new generation of Big Data graph analytics by introducing new algorithms and architectures that improve graph storage and processing capability by orders of magnitude. By enabling companies and organizations to analyze their complex, massive, and hidden relationship data, this project can help them gain deeper, actionable insights. Specifically, the technology developed in this project can directly benefit online networking, marketing and advertisement, banking and financial institutions, logistics and transportation, telecommunications, healthcare and hospitals, and homeland security, among others. During the Phase I project, we have been actively engaging with a list of early customers in these industries to develop customized graph analytics solutions, which can help them find, attract, and retain customers, improve operations, reduce waste, minimize risk, and prevent fraud.
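
To make the toolkit functions named under Intellectual Merits concrete, the following is a minimal Python sketch of what such operations compute on a tiny in-memory graph. It is an illustration only, not GraphSQL's engine or language: the graph `GRAPH`, the function names, the use of degree as a stand-in for node importance, and the common-neighbor count for link recommendation are all assumptions chosen for brevity.

```python
from collections import deque

# Hypothetical toy graph: an undirected network stored as adjacency sets.
GRAPH = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
    "F": set(),  # isolated node, unreachable from the others
}


def shortest_path(graph, source, target):
    """Breadth-first search; returns a fewest-hop path as a list, or None."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # Sorted iteration keeps the result deterministic for this example.
        for neighbor in sorted(graph[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                if neighbor == target:
                    # Walk parent pointers back to the source.
                    path = [target]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None


def is_reachable(graph, source, target):
    """True if any path exists from source to target."""
    return shortest_path(graph, source, target) is not None


def degree_importance(graph):
    """Simplest importance measure (an assumption here): number of links."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def recommend_links(graph, node):
    """Rank non-neighbors of `node` by how many neighbors they share with it."""
    scores = {}
    for neighbor in graph[node]:
        for candidate in graph[neighbor]:
            if candidate != node and candidate not in graph[node]:
                scores[candidate] = scores.get(candidate, 0) + 1
    return sorted(scores, key=lambda c: (-scores[c], c))
```

Each function follows links directly from node to node. Expressing the same traversals over relational tables generally requires repeated self-joins or recursive queries, which is the inefficiency the report attributes to tabular storage.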

Agency: National Science Foundation (NSF)
Institute: Division of Industrial Innovation and Partnerships (IIP)
Type: Standard Grant (Standard)
Application #: 1248736
Program Officer: Muralidharan S. Nair
Project Start:
Project End:
Budget Start: 2013-01-01
Budget End: 2013-06-30
Support Year:
Fiscal Year: 2012
Total Cost: $149,993
Indirect Cost:

Name: Graphsql Inc
Department:
Type:
DUNS #:
City: Millbrae
State: CA
Country: United States
Zip Code: 94030