Genomic profiling has become routine in selecting treatments for many diseases, enabling the classification of patients into categories associated with improved outcomes for specific treatments. One potential limitation of this approach is the tremendous heterogeneity of the tissues used for profiling. Genomic classifications, obtained from a relatively small biopsy, are subject to influence from broad regional variations in the affected tissue. Heterogeneity on a cellular scale can also obscure the target of treatment, as cells with distinct molecular profiles are homogenized in genomic profiling. Realizing better therapies will depend greatly on the ability to understand molecular heterogeneity within an individual, a challenge that necessitates new approaches to organize, analyze and integrate data from multiple spatial and molecular scales. This proposal describes an informatics framework for characterizing heterogeneity in tissue-based studies. The framework will combine imaging informatics with genomics to describe molecular heterogeneity at multiple spatial and molecular scales. The imaging component will leverage a novel quantum dot technology that enables detailed mapping of multiple protein expression pathways within a single sample. Fluorescence in situ hybridization imaging will be used to measure DNA content. Whole-slide digitization will enable computer algorithms to capture molecular profiles of hundreds of millions of cells, calculating quantitative features to describe their expression patterns and DNA content. Biologically meaningful descriptions of each cell will be generated using a novel active machine learning classifier to annotate cells with an ontology describing molecular biology and cell anatomy, enabling slides to be analyzed in a biological context. Cell boundaries, features, and annotations will be integrated through the Pathology Analytic Imaging Standards (PAIS) database to provide support for data mining analysis.
Mining methods will be developed to find the enrichment of cellular phenotypes, and to analyze the spatial layout of cells with respect to structures like blood vessels to discover the influence of the tissue microenvironment on key expression pathways in surrounding cells. These tools will be applied to studies of glioblastoma brain tumors, but are relevant for studies of other solid tissue diseases. The scientific study will use tissues resected in a novel clinical trial that accurately defines the invading tumor margin, bulk and necrosis-rich core. Tissues will be analyzed for gene expression and imaging to generate a paired genomic-imaging profile for each region. Mining the imaging and gene expression profiles of these regions will identify intra-tumoral differences in cellular phenotypes and illustrate the extent of variation in genomic classifications. The paired imaging and gene expression profiles will also be mined to determine relationships between specific expression classes and the imaging observations, giving a more complete picture of heterogeneity. A project repository will be deployed to disseminate images, analysis pipelines and analytic results. This repository will provide a public resource for brain tumor research and access to open-source tools.
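
As an illustration of the kind of spatial mining described above, the following is a minimal sketch of testing whether a cellular phenotype is enriched near blood vessels. It is not the method of this proposal: the function names, the phenotype label `marker_high`, the fixed distance radius, and the use of a one-sided hypergeometric test are all assumptions made for the example, and the coordinates are toy values rather than real data.

```python
from math import comb, hypot


def nearest_vessel_distance(cell, vessels):
    """Euclidean distance from a cell centroid (x, y, label) to the closest vessel point."""
    return min(hypot(cell[0] - vx, cell[1] - vy) for vx, vy in vessels)


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n cells are drawn from N cells, K of which carry the phenotype."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def perivascular_enrichment(cells, vessels, phenotype, radius):
    """Test whether `phenotype` is over-represented within `radius` of any vessel.

    cells:   list of (x, y, label) tuples (hypothetical annotation output)
    vessels: list of (x, y) points sampled along vessel structures
    Returns (k, n, K, N, p_value).
    """
    N = len(cells)
    K = sum(1 for _, _, label in cells if label == phenotype)
    near = [c for c in cells if nearest_vessel_distance(c, vessels) <= radius]
    n = len(near)
    k = sum(1 for _, _, label in near if label == phenotype)
    return k, n, K, N, hypergeom_sf(k, N, K, n)


# Toy example: one vessel point at the origin and eight annotated cells.
toy_cells = [
    (1, 0, "marker_high"), (0, 1, "marker_high"), (1, 1, "marker_high"),
    (2, 0, "other"), (10, 10, "other"), (12, 9, "other"),
    (11, 12, "marker_high"), (15, 15, "other"),
]
toy_vessels = [(0, 0)]

k, n, K, N, p = perivascular_enrichment(toy_cells, toy_vessels, "marker_high", radius=2.5)
print(f"{k} of {n} perivascular cells carry the phenotype ({K} of {N} overall), p = {p:.3f}")
```

In practice the brute-force nearest-vessel search would be replaced by a spatial index, and the choice of radius and statistical test would need to be justified for the tissue being studied.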

Public Health Relevance

Developing effective treatments for diseases requires an understanding of their molecular mechanisms. The software tools created by this research will enable researchers to better identify variations in the mechanisms of disease within an individual, and to develop and apply more effective therapies to improve patient outcomes.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Career Transition Award (K22)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Emory University
Schools of Medicine
United States
Mobadersany, Pooya; Yousefi, Safoora; Amgad, Mohamed et al. (2018) Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A 115:E2970-E2979
Halani, Sameer H; Yousefi, Safoora; Vega, Jose Velazquez et al. (2018) Multi-faceted computational assessment of risk and progression in oligodendroglioma implicates NOTCH and PI3K pathways. NPJ Precis Oncol 2:24
Yousefi, Safoora; Amrollahi, Fatemeh; Amgad, Mohamed et al. (2017) Predicting clinical outcomes from large scale cancer genomic profiles with deep survival models. Sci Rep 7:11707
Wilkinson, S; Hou, Y; Zoine, J T et al. (2017) Coordinated cell motility is regulated by a combination of LKB1 farnesylation and kinase activity. Sci Rep 7:40929
Nalisnik, Michael; Amgad, Mohamed; Lee, Sanghoon et al. (2017) Interactive phenotyping of large-scale histology imaging data with HistomicsML. Sci Rep 7:14588
Nalisnik, Michael; Gutman, David A; Kong, Jun et al. (2015) An Interactive Learning Framework for Scalable Classification of Pathology Images. Proc IEEE Int Conf Big Data 2015:928-935
Cooper, Lee A D; Kong, Jun; Gutman, David A et al. (2015) Novel genotype-phenotype associations in human cancers enabled by advanced molecular platforms and computational analysis of whole slide images. Lab Invest 95:366-76
Gutman, David A; Cobb, Jake; Somanna, Dhananjaya et al. (2013) Cancer Digital Slide Archive: an informatics resource to support integrated in silico analysis of TCGA pathology data. J Am Med Inform Assoc 20:1091-8
Cooper, Lee A D; Carter, Alexis B; Farris, Alton B et al. (2012) Digital Pathology: Data-Intensive Frontier in Medical Imaging. Proc IEEE Inst Electr Electron Eng 100:991-1003