Unexpected findings, or incidentalomas, are increasing dramatically with the growth in the use of imaging technology within healthcare organizations. Incidentalomas may indicate significant health problems, such as malignancy in the medium or long term. However, they may also lead to overinvestigation, unnecessary radiation exposure, overtreatment, substantial downstream expenditures, and patient anxiety. Several systematic reviews have explored the prevalence and outcomes of incidentalomas. These studies used inconsistent and often inappropriate synthesis methods, commonly focusing on only one imaging modality or organ in a very limited number of patients. As a result, there is a need for large-scale studies of incidentalomas that can inform their follow-up and guide efforts to optimize health outcomes. To address this need, we propose to build natural language processing (NLP) approaches to identify cancer-related incidentalomas reported in radiology reports (Aim 1) and to create the first large-scale incidentaloma database, covering over half a million patients (Aim 2). Our research dataset will contain radiology reports, clinical notes containing imaging orders, and structured data such as demographic information (e.g., age) and diagnosis codes of patients who received radiologic imaging tests at the University of Washington Medical Center (UWMC), Harborview Medical Center (HMC), Seattle Cancer Care Alliance (SCCA), and Northwest Hospital and Medical Center (NWMC) between 2007 and 2019. Our patient population will be linked to the Hutchinson Institute for Cancer Outcomes Research (HICOR) data repository for detailed cancer outcomes and claims data. The resulting database will be used for clinical and economic analysis of incidentalomas (Aim 3).
We will (1) evaluate the concordance between radiologists' documentation of incidentaloma follow-up and established clinical guidelines for thyroid, lung, adrenal, kidney, liver, and pancreas incidentalomas, (2) determine the risk of subsequent cancer diagnosis and the median survival for each category of incidentaloma, and (3) determine the incremental cost associated with follow-up imaging in patients with incidentalomas. All models and their implementations produced during this project will be shared with the community as open source. Additionally, the de-identified incidentaloma database will be made available to the research community under a data use agreement. By identifying risk factors for cancer diagnosis and death for common incidental findings, we will be able to provide critical information for the future development of clinical practice guidelines and appropriate use criteria. We have assembled a highly interdisciplinary team of experts in NLP, medical informatics, radiology, oncology, health outcomes, and health economics to ensure the successful completion of the proposed project.
Incidentalomas may indicate significant health problems for the patient, such as malignancy in the short or medium term, but they may also lead to overtreatment, which brings substantial downstream expenditures as well as patient anxiety. In this project, we will build natural language processing approaches to extract cancer-related incidentalomas reported in radiology reports (Aim 1) and create the first large-scale incidentaloma database, covering half a million patients who received radiologic imaging tests at the University of Washington Medical Center, Harborview Medical Center, Seattle Cancer Care Alliance, and Northwest Hospital and Medical Center between 2007 and 2019 (Aim 2). This database will be used for clinical and economic analysis of incidentalomas (Aim 3).