Recent advances in big data infrastructure and algorithmic foundations have unleashed a torrent of data that is collected and stored in distributed data centers around the world. The growing availability of these massive datasets gives rise to many machine learning tasks that are inherently related. Transfer learning paradigms have therefore been developed over the past decade to transfer knowledge among tasks and improve their generalization performance. This project will develop a suite of large-scale lifelong learning methods to address the significant challenges of knowledge transfer on big data. The algorithms and tools developed in this project will directly impact biomedical informatics and intelligent transportation systems, where they will be used to build personalized predictive models from electronic medical records and traffic state models from large-scale traffic data. The outcomes of this project will also be used to develop a new curriculum that incorporates research into the classroom and gives students from under-represented groups opportunities to participate in machine learning research.

The velocity, volume, variability, and variety that characterize big data pose significant challenges for traditional lifelong learning approaches. This project will advance lifelong learning by (1) developing a distributed lifelong learning framework that enables online knowledge transfer on large-scale distributed datasets; (2) designing effective methods to track temporal drift in task relationships and to leverage human knowledge via interactive transfer; and (3) investigating strategies that enable distributed lifelong learning to handle heterogeneity in both feature spaces and learning tasks. The results of this project will have an immediate and strong impact on the theoretical and algorithmic foundations of Big Data by making a large-scale lifelong learning framework readily available for many Big Data analytics tasks. All findings, publications, software, and data will be made publicly available at the project website: http://jiayuzhou.github.io/projects/career.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1749940
Program Officer: Wei Ding
Project Start:
Project End:
Budget Start: 2018-08-01
Budget End: 2023-07-31
Support Year:
Fiscal Year: 2017
Total Cost: $433,151
Indirect Cost:
Name: Michigan State University
Department:
Type:
DUNS #:
City: East Lansing
State: MI
Country: United States
Zip Code: 48824