Early detection of lung cancer among asymptomatic individuals is a priority for reducing mortality from the leading cancer killer worldwide. Most lung cancers are first detected as indeterminate pulmonary nodules (IPNs). While the vast majority of IPNs are benign, malignant nodules present with specific features that should allow early discrimination and intervention. We recently completed a study demonstrating that structural imaging feature analysis improves the detection of cancers among IPNs, with an accuracy of over 90% for a model trained in the NLST and validated in two independent cohorts. Relative to the baseline risk estimate based on clinical parameters (Mayo model), the AUC increased from 0.78 to 0.84 and from 0.82 to 0.92 in the two independent validation cohorts. Similarly, we tested our high-sensitivity hsCYFRA 21-1 assay in three populations with lung nodules and obtained similar added value over the Mayo model. Finally, we identified signatures predictive of lung cancer using large-scale data mining of the electronic health record (EHR). The performance of the established imaging predictor, hsCYFRA concentrations, and EHR trajectories will be validated in a prospective cohort. In an innovative partnership among pulmonary oncology, radiology, machine learning, and data science experts at Vanderbilt, we propose to integrate the layer of clinical information accessible in the EHR to improve the accuracy of noninvasive diagnosis. In addition, we propose to take advantage of repeated measures to improve the accuracy of cancer prediction and to reduce the time to diagnosis. We therefore propose the following aims.
In Aim 1, we will validate advanced quantitative imaging analyses to distinguish benign from malignant IPNs early, based on repeated measures from 1000 individuals.
In Aim 2, we will test in 150 individuals with lung nodules whether repeated measures of the hsCYFRA 21-1 protein blood biomarker add diagnostic accuracy over baseline concentrations of the biomarker.
In Aim 3, we will test a deep learning strategy on the EHR of 20,000 patients from VUMC to identify patterns likely to improve the early detection of lung cancer.
In Aim 4, we will test the added value of monitoring changes in marker levels for early detection using repeated pre-diagnosis chest CT studies, serum analysis of hsCYFRA 21-1, and EHR patterns from our lung cancer screening program.
Built upon strong preliminary data and unique resources at VUMC, including access to large imaging and EHR data sources, this novel integrative study has the potential to generate highly impactful and translatable results that reduce false-positive rates among IPNs as well as morbidity and mortality from lung cancer. This application responds to PAR 19-264 by integrating longitudinal analysis of low-dose lung screening computed tomography with a lead serum biomarker and the power of artificial intelligence to mine the EHR, with the goal of discovering a novel integrative strategy for the early detection of premetastatic lung cancer.

Public Health Relevance

Most lung cancers present as pulmonary nodules, so improving survival from lung cancer requires noninvasive strategies that identify the disease early among a majority of benign nodules. This project focuses on establishing an integrative approach to early detection that leverages repeated measures of chest CT imaging features of lung nodules over time, the rate of change of a high-sensitivity blood biomarker (hsCYFRA 21-1), and longitudinal clinical patterns from the electronic health record. This research will inform the management of lung nodules identified on lung cancer screening CT scans and should reduce the time to diagnosis and the mortality of lung cancer.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Mazurchuk, Richard V
Vanderbilt University Medical Center
United States