Despite their adverse impact on patient quality of life and healthcare utilization and costs, symptom clusters (SCs) in common adult chronic conditions such as cancer, heart failure (HF), type 2 diabetes mellitus (T2DM), and chronic obstructive pulmonary disease (COPD) are understudied and poorly understood. The lack of access to real-world, longitudinal patient symptom data sets and the inability to adequately model the complexity of SCs have greatly limited research. Based on our previous work, we propose that these gaps can be addressed in an innovative way using electronic health records (EHRs) and data science techniques. Our overall objective is to develop, apply and refine, and implement an optimized data processing and analysis pipeline for the characterization of SCs in common adult chronic conditions for use with EHR data. We hypothesize that a core set of SCs is shared among all common adult chronic conditions and that distinct SCs characterize specific conditions and/or treatments. The long-term training goal of this project is to assist Dr. Koleck in becoming an independent investigator conducting a program of research dedicated to mitigating symptom burden in patients with chronic conditions through use of informatics and omics (e.g., genomics and proteomics), the focus of her pre-doctoral work. Using exceptional resources available from Columbia University, the K99 phase of this project will focus on the development of a rigorous pipeline; essential competencies in SC analysis and interpretation; and the data science techniques of clinical data mining, natural language processing, machine learning, and data visualization. In the R00 phase, Dr. Koleck will independently implement the pipeline in another medical center to determine the reproducibility of identified SCs and begin to explore clinical predictors (e.g., socio-demographics, laboratory results, and medications) of SCs.
The specific aims are to 1) develop a data-driven pipeline for the characterization of SCs from EHRs using a cohort of adult patients diagnosed with cancer, as SCs have been most systematically characterized in this condition; 2) apply the pipeline to three other common adult chronic conditions that share biological and behavioral risk factors with cancer, i.e., HF, T2DM, and COPD, and evaluate SCs in these conditions; and 3) determine if SCs differ for cancer, HF, T2DM, and COPD when implementing the pipeline within another medical center and explore clinically relevant, EHR-documented predictors of identified SCs. To accomplish research aims and training goals, an interdisciplinary team of scientists with expertise in symptom science, biomedical informatics, data science, pertinent clinical domains, and career development mentorship has been assembled. This research is significant because a pipeline that accommodates the format in which symptom data is already being documented in EHRs has the potential to greatly accelerate the acquisition of SC knowledge and expedite clinical translation of symptom mitigation strategies. Given the array of new competencies to be developed, this K99/R00 award is necessary for achieving the candidate's career goal of advancing chronic condition symptom science.

Public Health Relevance

The proposed research is relevant to public health because adult patients diagnosed with chronic conditions are frequently burdened by two or more co-occurring, related symptoms. Development of an optimized process to study multiple symptoms collected in patient electronic health records has the potential to lead to new knowledge and improved symptom management. The proposed research addresses the NINR theme of "symptom science" and a key area in the NINR Strategic Plan, "taking advantage of innovations in data science in order to develop interventions to promote health and wellness that are leading-edge, effective, and translatable to clinical practice."

National Institutes of Health (NIH)
National Institute of Nursing Research (NINR)
Career Transition Award (K99)
Study Section
National Institute of Nursing Research Initial Review Group (NRRC)
Program Officer
Hamlet, Michelle R
Columbia University (N.Y.)
Other Health Professions
Schools of Nursing
New York
United States