The increasing availability of high-quality digital scanners has enabled the generation of large collections of histology images, confocal/multichannel images, and accompanying metadata. However, there is a dearth of robust open-source solutions to efficiently visualize, process, and manage these ever-growing imaging collections. Our goal is to open-source, document, and further develop integrative technologies, leveraging our experience with the Cancer Digital Slide Archive (CDSA), a tool we have developed to facilitate analysis of data provided by the NCI's The Cancer Genome Atlas (TCGA). This tool is not, and will not be, limited to the analysis of TCGA data; however, by working backwards from public data already available, we can ensure that the informatics technologies developed are scalable and usable by the cancer community. In our proposal, we will first conduct a software engineering review to simplify installation and facilitate distribution to other research groups. We have partnered with Kitware for this proposal, allowing us to draw on their 15+ years of experience in building and maintaining quality open-source software. The rest of the proposal will focus on the testing and integration of new features, such as the ability to perform image quantification (e.g., cell counting, cell profiling), image markup and labeling, and basic group-level analysis that allows imaging features to be correlated with user-defined variables of interest. As an example, a user may classify an individual slide based on the mean density of nuclei and correlate this imaging parameter with patient survival or with tumor grade.
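The group-level analysis described in the closing example of the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the CDSA implementation: the function names, the per-tile nuclei counts, the tile area, and the tumor grades below are all hypothetical, and Spearman rank correlation is one reasonable choice for an ordinal variable such as grade. Correlation with patient survival would additionally require handling censored follow-up times, which this sketch does not attempt.

```python
from math import sqrt


def mean_nuclear_density(nuclei_per_tile, tile_area_mm2):
    """Mean nuclei per square millimetre across the tiles of one slide."""
    densities = [count / tile_area_mm2 for count in nuclei_per_tile]
    return sum(densities) / len(densities)


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: per-tile nuclei counts and tumor grade per slide.
TILE_AREA_MM2 = 0.25  # assumed tile area, for illustration only
slides = {
    "slide_A": {"nuclei_per_tile": [50, 60, 70], "grade": 2},
    "slide_B": {"nuclei_per_tile": [90, 110, 100], "grade": 3},
    "slide_C": {"nuclei_per_tile": [150, 170, 160], "grade": 4},
    "slide_D": {"nuclei_per_tile": [80, 70, 90], "grade": 2},
}

densities = [
    mean_nuclear_density(s["nuclei_per_tile"], TILE_AREA_MM2)
    for s in slides.values()
]
grades = [s["grade"] for s in slides.values()]

rho = spearman(densities, grades)
print(f"Spearman correlation between nuclear density and grade: {rho:.3f}")
```

In practice the per-tile counts would come from an image quantification step such as nuclear segmentation, and a statistics library would typically replace the hand-written correlation and supply a significance test.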

Public Health Relevance

The goal of our proposal is to further develop an open-source platform for the management, visualization, and analysis of large image repositories focused on cancer imaging, specifically digital histology. The size of these image collections (gigabytes to terabytes) requires new technologies and platforms for the efficient analysis and dissemination of these images, as well as for linking them to other relevant information (patient diagnosis, age, gender, etc.). Without such tools, these data will remain largely unavailable for better understanding and characterizing tumor biology.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Resource-Related Research Projects--Cooperative Agreements (U24)
Study Section
Special Emphasis Panel (ZCA1-TCRB-9 (J1))
Program Officer
Ossandon, Miguel
Emory University
Internal Medicine/Medicine
Schools of Medicine
United States
Yousefi, Safoora; Amrollahi, Fatemeh; Amgad, Mohamed et al. (2017) Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models. Sci Rep 7:11707
Nalisnik, Michael; Amgad, Mohamed; Lee, Sanghoon et al. (2017) Interactive phenotyping of large-scale histology imaging data with HistomicsML. Sci Rep 7:14588
Wilkinson, S; Hou, Y; Zoine, J T et al. (2017) Coordinated cell motility is regulated by a combination of LKB1 farnesylation and kinase activity. Sci Rep 7:40929
Dunn Jr, William D; Cobb, Jake; Levey, Allan I et al. (2016) REDLetr: Workflow and tools to support the migration of legacy clinical data capture systems to REDCap. Int J Med Inform 93:103-10
Dunn Jr, William D; Gearing, Marla; Park, Yuna et al. (2016) Applicability of digital analysis and imaging technology in neuropathology assessment. Neuropathology 36:270-82
Cooper, Lee A D; Kong, Jun; Gutman, David A et al. (2015) Novel genotype-phenotype associations in human cancers enabled by advanced molecular platforms and computational analysis of whole slide images. Lab Invest 95:366-76
Nalisnik, Michael; Gutman, David A; Kong, Jun et al. (2015) An Interactive Learning Framework for Scalable Classification of Pathology Images. Proc IEEE Int Conf Big Data 2015:928-935
Gutman, David A; Cobb, Jake; Somanna, Dhananjaya et al. (2013) Cancer Digital Slide Archive: an informatics resource to support integrated in silico analysis of TCGA pathology data. J Am Med Inform Assoc 20:1091-8