The importance of understanding interactions among social, behavioral, environmental, and genetic factors and their relationship to health has led to greater interest in studying these determinants of disease in the biomedical research community. While some knowledge exists regarding contributions of specific determinants such as socioeconomic status, educational background, tobacco and alcohol use, and genetic susceptibility to particular diseases or conditions, enhanced methods are needed to analyze and ascertain interrelationships among multiple determinants and to discover potentially unexpected relationships that may ultimately contribute to improving patient care and population health. The increased adoption of electronic health record (EHR) systems has the potential to enhance collection of and access to a wide range of information about an individual's lifetime health status and health care, supporting a range of "secondary uses" such as biomedical, behavioral and social science, and public health research. Traditionally, clinicians document an individual's health history in clinical notes, including social and behavioral factors within the "social history" section and familial factors in the "family history" section. While some EHR systems have specific modules for collecting social and family history in structured or semi-structured formats, a large amount of this information is recorded primarily in narrative format, thus necessitating automated methods to facilitate the extraction and integration of social, behavioral, and familial factors for subsequent uses. Once these factors are extracted, knowledge acquisition and discovery methods can be applied both to confirm known relationships relative to specific diseases or conditions and to potentially discover new relationships.
We hypothesize that advanced computational methods can transform social, behavioral, and familial factors from the EHR into a rich longitudinal resource for generating knowledge regarding various determinants of health including their temporal progression, severity, and relationship to health conditions. Towards this goal, the specific aims are to: (1) develop comprehensive information models and natural language processing (NLP) techniques to represent, extract, and integrate social, behavioral, and familial factors from social and family history information in the EHR, (2) adapt and extend data mining techniques to identify non-temporal and temporal relationships among these factors and diseases, and (3) evaluate and validate known and candidate new relationships for specific conditions (pediatric asthma and epilepsy). This multi-site proposal will involve a transdisciplinary team of investigators from the University of Vermont and University of Minnesota, use of EHR data from both institutions, and collaborative development and evaluation of the NLP and data mining techniques. Ultimately, this work has the potential to provide a generalizable approach for supporting and enhancing existing knowledge regarding the interactions among social, behavioral, and familial factors and diseases.

Public Health Relevance

The ability to systematically collect and analyze social, behavioral, and familial factors from the electronic health record using automated methods could assist in developing a rich longitudinal resource for enhancing knowledge regarding the interactions among these factors and diseases. This knowledge could ultimately contribute to improving patient care and population health.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
University of Vermont & St Agric College
Internal Medicine/Medicine
Schools of Medicine
United States
Wang, Yan; Chen, Elizabeth S; Leppik, Ilo et al. (2016) Identifying Family History and Substance Use Associations for Adult Epilepsy from the Electronic Health Record. AMIA Jt Summits Transl Sci Proc 2016:250-9
McEwan, Reed; Melton, Genevieve B; Knoll, Benjamin C et al. (2016) NLP-PIER: A Scalable Natural Language Processing, Indexing, and Searching Architecture for Clinical Notes. AMIA Jt Summits Transl Sci Proc 2016:150-9
Rajamani, Sripriya; Chen, Elizabeth S; Akre, Mari E et al. (2015) Assessing the adequacy of the HL7/LOINC Document Ontology Role axis. J Am Med Inform Assoc 22:615-20
Winden, Tamara J; Chen, Elizabeth S; Wang, Yan et al. (2015) Towards the Standardized Documentation of E-Cigarette Use in the Electronic Health Record for Population Health Surveillance and Research. AMIA Jt Summits Transl Sci Proc 2015:199-203
Carter, Elizabeth W; Sarkar, Indra Neil; Melton, Genevieve B et al. (2015) Representation of Drug Use in Biomedical Standards, Clinical Text, and Research Measures. AMIA Annu Symp Proc 2015:376-85
Melton, Genevieve B; Wang, Yan; Arsoniadis, Elliot et al. (2015) Analyzing Operative Note Structure in Development of a Section Header Resource. Stud Health Technol Inform 216:821-6
Chen, Elizabeth S; Carter, Elizabeth W; Winden, Tamara J et al. (2015) Multi-source development of an integrated model for family health history. J Am Med Inform Assoc 22:e67-80
Chen, Elizabeth S; Melton, Genevieve B; Wasserman, Richard C et al. (2015) Mining and Visualizing Family History Associations in the Electronic Health Record: A Case Study for Pediatric Asthma. AMIA Annu Symp Proc 2015:396-405
Chen, Elizabeth S; Sarkar, Indra Neil (2014) Mining the electronic health record for disease knowledge. Methods Mol Biol 1159:269-86
Rajamani, Sripriya; Chen, Elizabeth S; Wang, Yan et al. (2014) Extending the HL7/LOINC Document Ontology Settings of Care. AMIA Annu Symp Proc 2014:994-1001

Showing the most recent 10 out of 15 publications