Digital data from social media, online searches, and smartphones can reveal a detailed narrative of an individual's day-to-day activities. Information about lifestyle and health behaviors (e.g., exercise habits, food consumption, smoking status) is often revealed in significant detail through these electronic platforms. Importantly, many commonly shared health behaviors may be associated with cardiovascular disease and its treatment and management. Digital phenotypes derived from these electronically mediated data sources can shape our assessment of human illness and have substantial value beyond traditional approaches to characterizing a disease phenotype (e.g., physical exam, laboratory values). Ultimately, they can expand our ability to identify and diagnose health conditions and to predict healthcare utilization. Central to this proposal is the recognition that person-to-person communication and online activities that were previously private are now observable. It is the observability of these new communication channels that provides both innovation and promise to this area of inquiry.
Our first aim will entail obtaining patients' consent to share access to their digital data (e.g., social, search, and mobile data) and merging this information with validated health record data in a research database. We will then extensively process the digital data into an interpretable format that can be incorporated into traditional predictive models.
Aim 2 will assess the incremental benefit of adding digital data to the Framingham risk score for predicting cardiovascular risk. In the future, these data could inform patients about their personalized risk and about concrete ways to change that risk. The digital platforms used to post or share data could also be used to provide feedback directly to patients, on the medium they use and in direct response to their stated inputs.
The third aim will focus on incorporating digital data into models to predict cost of care. This approach offers promise for better understanding the factors contributing to healthcare utilization, which correlates with morbidity, mortality, and the economic burden of cardiovascular disease. Through this project we seek to gain new insights into collecting and analyzing digital data while remaining attentive to the ethical and privacy issues that may be associated with these data. We will incorporate digital data into models to predict important targets such as coronary heart disease risk and healthcare use. Overall, the areas of focus for this grant represent new frontiers in precision medicine and digital phenotyping for cardiovascular health.

Public Health Relevance

This proposal seeks to outline the challenges and opportunities in harnessing and interpreting digital media data (e.g., social media, online search, smartphone data) to improve the health of patients with cardiovascular disease. Further exploration in this area would provide a better understanding of how patients and physicians can use and respond to health-related digital data to improve cardiovascular health.

National Institutes of Health (NIH)
National Heart, Lung, and Blood Institute (NHLBI)
Research Project (R01)
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Campo, Rebecca A
University of Pennsylvania
Emergency Medicine
Schools of Medicine
United States