This proposal is a collaboration with the HTAN Data Coordination Center (DCC) and describes an Image Data Project aimed at developing and deploying the technology needed for storage, distribution, and basic analysis of cell and tissue images collected by multiple HTAN Centers. Multiplexed tissue images are an important type of data for nearly all of the centers contributing to the HTAN (second only to single-cell sequencing data in the number of centers collecting data). However, the software needed to visualize, analyze, manage, and share multiplexed images of tissues and tumors is underdeveloped. The initial availability of SARDANA images has highlighted the challenges faced by HTAN, including the DCC, in deploying an infrastructure for distributing large and complex images. We therefore propose a two-year HTAN Image Data Project (IDP), led by the DCC and HMS PCA, focused on the rapid development and deployment of image informatics systems and computational resources for image management and analysis. Our goal is to put in place a functional first-generation system no later than summer 2020 and then to refine the system steadily so that it becomes the backbone of cross-functional HTAN atlases. As a matter of necessity, we will start with informatics systems and software that are either available today or in a relatively advanced state of development. However, we expect to evaluate these choices throughout the IDP and change course as necessary to incorporate potentially superior approaches. We will also support the diverse needs and formats of centers using different data collection methods.
Aim 1 will focus on the deployment and progressive improvement of a cloud-based database for image management built on the OMERO platform, as well as a parallel system for access to primary data.
Aim 2 will develop and deploy software that allows the general public to visualize HTAN image data. The IDP will use the existing MCWG and DAWG mechanisms for oversight and reporting, and all centers will be invited to participate. Within the IDP, the HMS PCA will take primary responsibility for the initial deployment of image informatics software. The DCC and HMS will jointly undertake software development and code hardening, and the DCC will take the lead in user assistance and software deployment, particularly in year two.
Images of tumor specimens obtained from biopsy or surgery are one of the primary means by which pathologists diagnose and stage cancer, but such images have typically lacked molecular detail. The highly multiplexed tissue images being collected by HTAN will fundamentally change this, and it is therefore essential that the data be efficiently and widely distributed. The HTAN Image Data Project (IDP) will address an acute need for software for data dissemination and visualization.