The decreasing cost of information technologies has rapidly enabled the collection, storage, and application of highly sensitive personal information in healthcare environments, which until recently depended on paper documentation, face-to-face interactions, and physical protections for all trust-related matters. As these environments migrate to the electronic setting, it is imperative, as well as our legal and social obligation, to protect the privacy of patients' electronic health records (EHRs) from threats that are external, as well as internal, to healthcare organizations (HCOs). For the most part, the medical informatics and computer science communities have focused on the external threat, which has led to the development of sophisticated information and computer security mechanisms. However, the internal threat has been neglected, mainly due to the dynamic nature of complex HCOs, such as large distributed medical centers. One of the most significant challenges of data protection in HCOs is that we cannot limit service providers' access to the records in mission-critical settings. If a hospital patient requires treatment and a care provider's access to their EHR is delayed or denied, the patient may suffer considerable harm or death. Federal regulations, such as the Security Rule of the Health Insurance Portability and Accountability Act, require HCOs to retain access logs, but there are no clear mechanisms for auditing beyond simple manual spot checks, which are limited in scope. Thus, the overarching goal of this project is to develop automated methods to mine EHR access logs to detect when potentially privacy-violating accesses have been committed, so that the appropriate authorities may be alerted to follow up with an investigation.
Our primary goal is to develop informatics tools to monitor how users (e.g., physicians) access the records of subjects (e.g., patients) in the system and flag potentially privacy-compromising actions (e.g., an unauthorized "peek"). The proposed tools will integrate HCO knowledge and access log repositories to represent the system as a dynamic social network of teams and business processes that are applied to score the "safety" of each recorded access. The specific objectives of the proposed project are (1) to develop a scientific foundation for automatically learning and modeling the normal business operations of HCOs from EHR access logs, (2) to automatically detect EHR accesses that are suspicious in the context of learned HCO operations, (3) to evaluate our approach with expert feedback, and (4) to implement our approaches in an extendable software tool that is rapidly reconfigurable to any EHR system. In support of these goals, we will evaluate real-world access logs from the EHR system of the Vanderbilt University Medical Center, which is a detailed repository with data covering tens of thousands of users and over a million patients. We believe that auditing tools for EHR systems, such as those developed through this research, are crucial to the continued adoption of health information technologies without sacrificing patients' privacy rights.
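
The idea of scoring each access against the network implied by the logs can be illustrated with a minimal sketch. This is not the project's method: it assumes a hypothetical log of (user, patient) pairs, treats two users as related when they access overlapping sets of patients, and scores an access by how similar the user is to the other users who accessed the same patient. The names `build_user_patients`, `jaccard`, and `score_access`, and the sample data, are all illustrative assumptions.

```python
from collections import defaultdict


def build_user_patients(log):
    """Map each user to the set of patients whose records they accessed."""
    user_patients = defaultdict(set)
    for user, patient in log:
        user_patients[user].add(patient)
    return user_patients


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_access(user, patient, user_patients):
    """Mean similarity between `user` and the other users who accessed `patient`.

    The patient in question is excluded from both sides of the comparison so
    that the access being scored does not vouch for itself. A low score means
    the user has little in common with the patient's other accessors.
    """
    own = user_patients.get(user, set()) - {patient}
    peers = [u for u, ps in user_patients.items() if u != user and patient in ps]
    if not peers:
        # Assumption: an access with no co-accessors gets the lowest score.
        return 0.0
    total = sum(jaccard(own, user_patients[p] - {patient}) for p in peers)
    return total / len(peers)


# Hypothetical access log: (user, patient) pairs.
log = [
    ("dr_a", "p1"), ("dr_a", "p2"), ("dr_a", "p3"),
    ("nurse_b", "p1"), ("nurse_b", "p2"), ("nurse_b", "p3"),
    ("clerk_c", "p9"), ("clerk_c", "p8"), ("clerk_c", "p1"),
]

user_patients = build_user_patients(log)
scores = {(u, p): score_access(u, p, user_patients) for u, p in set(log)}

# List accesses from least to most consistent with the observed network.
for (u, p), s in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{u} -> {p}: {s:.2f}")
```

In this toy log, `clerk_c` shares no other patients with the users who accessed `p1`, so that access receives the lowest possible score, while accesses among `dr_a` and `nurse_b` score higher. A real auditing tool would also need time windows, organizational knowledge, and expert review before any access is flagged.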

Public Health Relevance

To ensure wide-scale adoption of electronic health records (EHRs), it is crucial to apply technologies that uphold patient privacy in the face of complex demands and healthcare operations. In this research, we will develop technologies to assist healthcare administrators' surveillance efforts by monitoring EHR access logs. The specific goals of this project are to (1) develop a scientific foundation for learning business operations from EHR access logs, (2) tailor data mining techniques to detect anomalous accesses, (3) evaluate our approaches with real logs and healthcare experts, and (4) deploy a reconfigurable and extendible software toolkit to support access log mining and surveillance for any EHR system.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
Vanderbilt University Medical Center
Internal Medicine/Medicine
Schools of Medicine
United States
Li, Thomas; Gao, Cheng; Yan, Chao et al. (2018) Predicting Neonatal Encephalopathy From Maternal Data in Electronic Medical Records. AMIA Jt Summits Transl Sci Proc 2017:359-368
Chen, You; Lorenzi, Nancy M; Sandberg, Warren S et al. (2017) Identifying collaborative care teams through electronic medical record utilization patterns. J Am Med Inform Assoc 24:e111-e120
Hedda, Monica; Malin, Bradley A; Yan, Chao et al. (2017) Evaluating the Effectiveness of Auditing Rules for Electronic Health Record Systems. AMIA Annu Symp Proc 2017:866-875
Gao, Cheng; Kho, Abel N; Ivory, Catherine et al. (2017) Predicting Length of Stay for Obstetric Patients via Electronic Medical Records. Stud Health Technol Inform 245:1019-1023
Chen, Robert; Sun, Jimeng; Dittus, Robert S et al. (2016) Patient Stratification Using Electronic Health Records from a Chronic Disease Management Program. IEEE J Biomed Health Inform :
Yan, Chao; Chen, You; Li, Bo et al. (2016) Learning Clinical Workflows to Identify Subgroups of Heart Failure Patients. AMIA Annu Symp Proc 2016:1248-1257
Chen, You; Xie, Wei; Gunter, Carl A et al. (2015) Inferring Clinical Workflow Efficiency via Electronic Medical Record Utilization. AMIA Annu Symp Proc 2015:416-25
Chen, You; Ghosh, Joydeep; Bejan, Cosmin Adrian et al. (2015) Building bridges across electronic health record systems through inferred phenotypic topics. J Biomed Inform 55:82-93
Chen, You; Lorenzi, Nancy; Nyemba, Steve et al. (2014) We work with them? Healthcare workers interpretation of organizational relations mined from electronic health records. Int J Med Inform 83:495-506
Chen, You; Nyemba, Steve; Zhang, Wen et al. (2012) Specializing network analysis to detect anomalous insider actions. Secur Inform 1:
