The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase II project is that it will help scientists better understand the underlying relationships in their data and, as a result, help people make better decisions from that data. Big data makes it possible to identify truly meaningful relationships. Exposing these relationships can benefit many different problem domains, and the work therefore has high commercial appeal in applications such as fraud detection in credit card or insurance data, drug discovery, and identifying threats to national security in homeland security data. Identifying previously unseen relationships also plays a big part in scientific research and discovery. Great scientific discoveries come from a deep understanding of the world around us. Data can capture the relationships behind that understanding, but this is only useful if a researcher can identify them. This motivates the need for sophisticated tools that can present these relationships in a comprehensible way.

This Small Business Innovation Research (SBIR) Phase II project will address the problem of mining and visualizing very high-dimensional and time-varying data sets. Modern data sets can have many thousands of attributes, which can introduce a significant amount of noise and conflicting relationships. This problem is compounded in time-varying data sets, as some relationships can be inconsistent and disappear over time. Hence there is a great need to find and explain temporally consistent and reliable relationships in large data sets. This Phase II project will accomplish this by developing a set of novel interactive visualizations that will find these consistent relationships and explain how they change over time. It will use causal analysis and extend it to the temporal domain, in which causes with delayed effects will be identified (e.g., smoking causes cancer after many decades). Finally, as these data mining techniques can be very computationally expensive, this Phase II project will develop optimizations to make them more efficient. The company expects that the results of this research will enable scientists, and consequently end users, to identify previously unseen relationships, leading to new discoveries.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start:
Project End:
Budget Start: 2019-08-01
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2019
Total Cost: $759,343
Indirect Cost:
Name: Akai Kaeru, LLC
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10128