Cancers are often highly heterogeneous, with many different subtypes. These subtypes differ in clinical outcomes, including prognosis, response to treatment, recurrence, and metastasis. In addition, these subtypes are often associated with different genetic mutations, epigenetic events, gene expression profiles, molecular signatures, tissue and organ morphologies, and clinical phenotypes. Effective treatment requires a personalized characterization of genetic, molecular, and clinical biomarkers. Integrative genomics, in which multiple data modalities are used jointly to stratify patients into subtypes, holds clear promise for improving the prediction of differential clinical outcomes and enabling personalized treatment schemes. Our primary goal is to develop an informatics platform that enables the discovery of integrative biomarkers (drawing on multiple phenotypic and genomic data sources) that can effectively stratify patients into subtypes. Specifically, we focus on integrating histological image data with other data modalities. Histological image data obtained from tumor samples provide critical information about the organizational and morphological features of the tumor, which pathologists use to make diagnoses such as grading and staging. In addition, these cellular-level morphological features are manifestations of the molecular events and genomic characteristics of the cells, which are measured in genomic data. Therefore, morphological features provide an important bridge between clinical phenotypes and genomic data. However, the wide adoption of imaging data in cancer studies is hindered by large data sizes and the complexity of the required algorithms. To address this, we propose to develop an informatics system that enables integrative genomics with a focus on imaging genomics. The resulting software will be open source and freely available to research communities. We plan to achieve our goals via three specific aims.
First, we will develop software libraries for integrating genomic data, histological images, and clinical data for cancer biomarker discovery and subtyping. Second, we will integrate the image analysis and data integration algorithms, together with data visualization tools, into a high-throughput data management system previously developed at OSU, so that biomedical researchers and clinicians can retrieve data and carry out such analyses without having to repeatedly implement complex systems. Finally, we will test and further evaluate the software by applying it to multiple cancer studies. The system will be designed according to the principles of open source software and will be disseminated freely to the research communities.
Integrative genomics is an emerging approach to studying cancer that integrates data obtained from all modalities, including genomic, molecular, imaging, and clinical data. While it has been demonstrated that integrative genomics can lead to new discoveries in cancer subtyping and patient stratification, such studies are computationally challenging to carry out. In this project, we propose to develop a novel set of software tools that will enable researchers and clinicians to integrate these diverse data types. Such tools are likely to foster the development of personalized treatment schemes for cancer patients. This software will be applied to multiple cancer studies.