The investigators study a new class of statistical methods for learning time series and graphical models. Their approach is based on spectral analysis and matrix decomposition methods, which have enjoyed tremendous success in applications but whose use in graphical models has drawn less attention. The goal of this investigation is to extend the previous successes of matrix decomposition methods to more complicated time series and certain graphical models, leading to new statistical machine learning algorithms with important practical applications.

In the information age, an important measure of computer intelligence is the ability to analyze the huge amounts of data that become available electronically and to make critical decisions in uncertain environments. Statistical machine learning is the main technique for analyzing electronic data, and graphical models are mathematical tools that help both computer systems and human operators understand these complex data in order to facilitate decision making. However, traditional algorithms for learning graphical models have limitations that restrict the capabilities of modern computing systems. The current research explores a new class of mathematical algorithms that can be used to design more effective graphical models, which in turn allow modern computers to analyze data more accurately and achieve a higher level of intelligence.

Project Report

This project designs data analysis methods that can extract meaningful latent information from big data. In real-world applications, discovering useful information from data requires machine learning algorithms built on complex statistical models. These problems are challenging because the number of model parameters is very large. The research employs spectral analysis and matrix decomposition methods, which have been applied to a wide range of complex, large-scale data applications, including clustering (both for analyzing graph structure and for finding low-dimensional projections), web search (such as the PageRank algorithm and the Hubs and Authorities algorithm), latent semantic indexing, collaborative filtering (e.g. the Netflix challenge), matrix completion, and multi-task learning. Our investigation extends the previous successes of matrix decomposition methods to the more complicated time series and graphical models that are frequently encountered in practical applications. Building on the wide success and relative simplicity of spectral methods, we developed widely applicable tools for time series and more complicated models. This research opened the door to a new body of big data analysis techniques based on spectral methods. We have published more than twenty scientific papers and disseminated our results via academic conferences, web sites, courses, and lectures. Our methods lead to more meaningful models for discovering hidden information in data and to shorter computation times in a wide range of data analysis applications. These techniques have been used in industry and have had a significant impact on modern society, where big data and the methods for analyzing such data are becoming increasingly important.
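As a rough illustration of the kind of matrix decomposition referred to above, the following sketch uses a truncated singular value decomposition to recover low-rank latent structure from a noisy data matrix. It is not taken from the project's publications. The function name `truncated_svd`, the synthetic "user by item" data, the noise level, and the choice of three latent factors are all assumptions made for the example.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Return the best rank-`rank` approximation of `matrix` in Frobenius norm."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the `rank` largest singular values and their singular vectors.
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]


rng = np.random.default_rng(0)

# Hypothetical data: 50 "users" x 30 "items" driven by 3 latent factors.
latent_users = rng.normal(size=(50, 3))
latent_items = rng.normal(size=(3, 30))
signal = latent_users @ latent_items

# Observations are the low-rank signal plus a small amount of noise (assumed level).
observed = signal + 0.01 * rng.normal(size=signal.shape)

# Projecting onto the top 3 singular directions removes most of the noise.
denoised = truncated_svd(observed, rank=3)

relative_error = np.linalg.norm(denoised - signal) / np.linalg.norm(signal)
print(f"relative error of rank-3 reconstruction: {relative_error:.4f}")
```

The same decomposition underlies latent semantic indexing and many collaborative filtering approaches. In practice the number of latent factors is unknown and has to be chosen, for example by inspecting how quickly the singular values decay.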

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1016061
Program Officer: Todd Leen
Project Start:
Project End:
Budget Start: 2010-08-01
Budget End: 2014-07-31
Support Year:
Fiscal Year: 2010
Total Cost: $224,646
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: Piscataway
State: NJ
Country: United States
Zip Code: 08854