This project develops new tools for analyzing personal data archives, that is, the streams of digital data routinely recorded about different aspects of individuals' daily lives. Examples include keystrokes, email histories, text messages, social media interactions, and microblogs, as well as records of physical activity, diet, and sleep. As sensors become cheaper and more accurate and as the cost of data storage approaches zero, there is increasing demand for tools that allow individuals to analyze and gain insight into their own personal data. To meet this demand, the project is developing new statistical machine learning algorithms, with a particular focus on models and algorithms for personal archives in the form of event time-series data: logs of time-stamped events that involve interactions with other individuals and carry textual and other metadata. Testbed data sets supporting this research include publicly available archives of email histories, software development discussions, Twitter microblogs, Wikipedia editing interactions, and physical proximity data. In terms of broader societal impact, the tools being developed have the potential to significantly transform how individuals analyze their personal data to better understand and monitor their physical and mental health.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
1320527
Program Officer
Sylvia Spengler
Project Start
Project End
Budget Start
2013-10-01
Budget End
2018-09-30
Support Year
Fiscal Year
2013
Total Cost
$499,888
Indirect Cost
Name
University of California Irvine
Department
Type
DUNS #
City
Irvine
State
CA
Country
United States
Zip Code
92697