Machine learning that leverages individuals' event data can improve the accuracy of predicting future events, but it poses significant risks to each individual's privacy. Large volumes of human event data, such as online TV-viewing records, domain name server queries, and electronic records of hospital admissions, are becoming increasingly available in a wide variety of applications, including network analysis and services and healthcare analytics. Predictive modeling of these collective event sequences is beneficial for promoting nationwide economic and safety development. For example, in network traffic diagnosis, the analysis of user activities can be used to predict and control dynamic traffic demand, which improves risk response efficiency. In health informatics, the analysis of patient admission events can identify individuals at risk and optimize their treatment, which enhances public health preparedness and healthcare outcomes. However, by optimizing for the single goal of accuracy, machine learning algorithms trained on historical event data may amplify privacy risks. Studies have demonstrated that it is possible to infer private attributes such as demographics and locations from human activities such as online browsing histories and location check-in events. This project develops a trust-based machine learning framework that better protects human privacy while minimally impacting utility for predicting dynamic events. Research and education on interdisciplinary topics of machine learning and privacy are integrated into curriculum development, student research projects, and academic seminars.
The project develops a series of novel models and algorithms to analyze dynamic human events in three synergistic research thrusts. (1) Besides time-stamped event sequences, additional marker information such as event types and tags can be utilized to better capture the dependencies between events. This project investigates novel point processes, multi-view learning, and deep learning methods for analyzing dynamic human events with event marker information. (2) To improve human understanding of and trust in predictive modeling, the project develops interpretable algorithms that explain to individuals how their information is used in event prediction and what private information can potentially be inferred from their inputs. (3) Balancing privacy and utility benefits both individuals and service providers. This project investigates a user-specific privacy-preserving approach for event prediction and addresses the utility-privacy tradeoff by formulating it as a min-max optimization problem. These three research thrusts are complemented by a comprehensive evaluation in a number of application domains.
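As an illustration only, the min-max formulation of the utility-privacy tradeoff could take a form such as the one sketched here. The abstract does not specify the objective, so every symbol is an assumption introduced for this sketch: x_u is the event sequence of user u, y_u the future events to be predicted, s_u a private attribute of that user, theta the parameters of the event predictor, phi the parameters of a hypothetical adversary that tries to infer s_u, and lambda_u >= 0 a user-specific privacy weight.

```latex
% Illustrative sketch only; all symbols are assumptions, not the project's stated objective.
\min_{\theta}\;\max_{\phi}\;\;
\sum_{u}\Big[
  \mathcal{L}_{\mathrm{util}}(\theta;\, x_u, y_u)
  \;-\;
  \lambda_u\,\mathcal{L}_{\mathrm{priv}}(\theta, \phi;\, x_u, s_u)
\Big]
```

Under these assumptions, the inner maximization over phi drives the adversary's inference loss L_priv down, which yields the strongest attribute-inference attack against the current predictor. The outer minimization over theta keeps the prediction loss L_util low while raising the loss of that strongest adversary. A larger lambda_u places more weight on privacy for user u at some cost in prediction utility, which is one possible way to make the tradeoff user-specific.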
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.