As huge volumes of unlabeled data are generated and made available in many domains, annotating these data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. This project will investigate a family of transfer-learning methods as an automatic annotation tool that labels data for various machine-learning settings without human involvement. The novelty of the project's approach lies in using a common matching-based optimization technique to solve the different forms of transfer learning, with the optimization carried out through transformations at different levels for each form. The planned transfer-learning framework will exploit abundant unlabeled data or a few labeled samples in the target domain, together with prior knowledge in the source domain in the form of labeled source data, source models, or other auxiliary information. This common matching-based optimization framework yields a natural transition from low-level, sample-based matching to high-level, model-based matching across the different forms of transfer learning. The resulting family of transfer-learning methods will have promising ramifications in diverse areas such as intelligent robots and self-driving cars, allowing them to operate efficiently in new and changing environments without requiring large amounts of annotated data in those environments.

This project will investigate two major forms of transfer learning: domain adaptation and few-shot learning. The research will focus on studying the effect of the proposed matching-based optimization technique in solving these forms of transfer learning. The project will focus on three major tasks, depending on what information is available in each: (Task 1) Unsupervised domain adaptation, where the source-domain data is labeled while the target-domain data is unlabeled; here, the project team will investigate optimization based on matching each source-domain sample with each target-domain sample to learn a generalizable target model. (Task 2) Hypothesis transfer learning, where the source and target tasks differ, and only source models and sparsely labeled target-domain data will be used to learn a generalizable target model; the model will be learned by matching source models with target-domain samples. (Task 3) Few-shot learning, where the goal is to learn a generalizable target model from a few labeled target-domain samples by utilizing auxiliary source knowledge; the project team will study whether transformations between source-model parameters can serve as useful auxiliary source-domain knowledge. Hence, the planned research will minimize the requirement of obtaining large numbers of labeled samples in machine learning, and it will realize robust learning systems that generalize across tasks and domains. Furthermore, since the matching is carried out among individual samples and models locally and explicitly, the results are expected to surpass those of previous methods.
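To make the sample-based matching idea of Task 1 concrete, the sketch below computes Maximum Mean Discrepancy (MMD), one standard instantiation of a matching statistic between source-domain and target-domain samples; the abstract does not name a specific matching criterion, so the choice of MMD with a Gaussian kernel, and all function names here, are illustrative assumptions rather than the project's actual method. Minimizing such a statistic over a feature transformation is one common way to align the two domains.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel between rows of X and rows of Y.
    # X: (m, d), Y: (n, d) -> (m, n) kernel matrix.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd(source, target, sigma=1.0):
    # Maximum Mean Discrepancy: a sample-level matching statistic that
    # compares every source sample against every target sample.
    # It is zero when the two empirical distributions coincide and grows
    # as they diverge; a domain-adaptation method could minimize it over
    # a learned feature transformation (hypothetical usage here).
    k_ss = gaussian_kernel(source, source, sigma).mean()
    k_tt = gaussian_kernel(target, target, sigma).mean()
    k_st = gaussian_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Illustrative usage: a shifted target domain yields a larger mismatch.
rng = np.random.default_rng(0)
source_feats = rng.normal(0.0, 1.0, size=(50, 4))
target_feats = rng.normal(3.0, 1.0, size=(50, 4))
print(mmd(source_feats, source_feats))  # identical samples: exactly 0
print(mmd(source_feats, target_feats))  # shifted domain: positive value
```

In a full pipeline, the MMD term would typically be added to a supervised source loss so that the learned features both classify source data well and match the target distribution.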

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start:
Project End:
Budget Start: 2018-08-15
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2018
Total Cost: $515,938
Indirect Cost:
Name: Purdue University
Department:
Type:
DUNS #:
City: West Lafayette
State: IN
Country: United States
Zip Code: 47907