Accurate biomarker-driven prognostic stratification, response prediction, and cohort enrichment are critical for realizing precision treatment strategies and population health management approaches that optimize quality of life and survival for cancer patients. Genomics holds promise for improving the classification and prognostication of malignancies, yet oncology practice continues to rely heavily on immunohistochemistry (IHC) as a fundamental tool because of its practicality and its ability to provide protein-level and subcellular localization information. The goal of this proposal is to create an open-source software resource for the quantitative analysis of IHC-stained tissues and the effective integration of IHC, genomic, and clinical features for cancer classification and prognostication. This proposal builds on our collective experience in computer-assisted analysis of microscopic images (including IHC images), development of machine-learning methods to address the challenges of classification and prognostication with heterogeneous and high-dimensional data, and leadership in the collection and large-scale analysis of cancer outcomes involving collaboration with multiple medical centers. For the first time, this effort will create tools to integrate quantitative IHC imaging, clinical, and genomic information, which will in turn enable the research community to explore strategies for the classification of malignancies and prediction of outcomes. The proposed tools will be developed and extensively validated in close collaboration with the NCI-supported Lymphoma Epidemiology of Outcomes (LEO) cohort study, using its clinical, genomic, and digital pathology data. The software tools produced by this proposal will enable the characterization of subcellular protein expression in cell nuclei, membranes, and cytoplasmic compartments.
Spatial features of protein expression heterogeneity, along with patient-level summaries of protein expression, will be used to develop machine-learning classifiers for cancer subtypes, using diffuse large B-cell lymphomas as a driving application. Technology for automatic tuning of machine-learning algorithms will enable a broad class of clinically and biologically motivated users to apply these tools in their investigations. We will also provide an interactive dashboard that enables users to integrate genomic and IHC-based features to explore prognostic models of patient survival. These tools will be released and documented under an open-source model, integrated with HistomicsTK, and made available to the broader cancer research community.

Public Health Relevance

Classification of a patient's cancer is critical for personalizing their therapy, and in many cases involves the pathologic examination of tissues treated with multiple immunohistochemical stains. This process is highly subjective, leading to unreliable classification and suboptimal treatment. This proposal is focused on developing software tools to improve the reliability of cancer diagnosis and prognosis. These tools will provide a reliable means of interpreting information derived from a practice known as immunohistochemistry, improving the consistency of diagnosis and patient outcomes.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZCA1)
Program Officer
Divi, Rao L
Emory University
Biomedical Engineering
Schools of Medicine
United States