Despite significant advances in our knowledge of cancer genomics and in techniques that capture information across the genome, epigenome, and transcriptome, numerous challenges remain that slow the discovery and translation of findings from "bench to bedside". One of the biggest challenges in the effective treatment of many cancers is the large inter-patient variation in clinical response observed for many commonly used therapies. In the treatment of epithelial ovarian cancer (EOC), the standard therapy for patients with advanced disease is initial debulking surgery followed by carboplatin-paclitaxel combination chemotherapy. Unfortunately, even with modern chemotherapy, most patients with advanced disease relapse and die of EOC. One approach to overcoming the heterogeneity seen in treatment response has been the use of molecular profiling, or clustering, to determine molecularly based tumor subtypes. One hypothesis is that tumors within a subtype will be more homogeneous and thus may have a similar clinical response to a given therapy regimen. Traditionally, molecular profiling has been based on a single data type, usually gene expression data, or on the layering of results from clustering each data type individually. However, only a limited number of studies and methods have been proposed that use an integrative clustering approach. Because clinical outcome following cancer therapy is most likely not due to a single gene or data type, but rather to a complex process involving genetic variation, somatic mutations, mRNA, miRNA, DNA methylation, etc., using all available genomic information to determine clinically relevant molecular subtypes is essential.
Therefore, we propose to develop a Bayesian Integrative Molecular Clustering (BIMC) approach to determine genomic profiles that incorporate not only gene expression data, but also other sources of genomic information, such as DNA methylation, somatic mutations, and germline genetic variants (e.g., BRCA1 and BRCA2). In addition to incorporating multiple types of genomic data into the development of the molecular subtypes, we will also incorporate available clinical information through a semi-supervised step, so that the resulting molecular profiles are clinically relevant. Following the development of BIMC, simulation studies will be completed to compare the proposed modeling framework with existing approaches for clustering based on single and multiple genomic data types. We will also assess the utility of the proposed method by applying the modeling framework to existing data on a set of EOC cases from The Cancer Genome Atlas (TCGA). Following the identification of a profile that determines statistically significant, clinically relevant molecular subtypes, we will replicate this molecular profile in an independent EOC study conducted at the Mayo Clinic. This research will result in the development of an integrative clustering framework, BIMC, and software for molecular subtyping, which will aid in the discovery of novel cancer loci implicated in disease etiology and/or clinical outcomes.

Public Health Relevance

Two challenges in cancer genomics are tumor heterogeneity and the efficient identification of genomic variation that is important in disease etiology and response to cancer therapies. One approach to overcoming these challenges is molecular clustering to determine clinically relevant cancer subtypes. Therefore, we propose to develop an innovative Bayesian Integrative Molecular Clustering (BIMC) approach and software to enable the determination of profiles based on multiple sources of molecular and genomic information, along with available clinical information.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Exploratory/Developmental Grants (R21)
Study Section
Special Emphasis Panel (ZCA1-RPRB-7 (O1))
Program Officer
Ossandon, Miguel
University of Kansas
Biostatistics & Other Math Sci
Schools of Medicine
Kansas City
United States
Chalise, Prabhakar; Koestler, Devin C; Bimali, Milan et al. (2014) Integrative clustering methods for high-dimensional molecular data. Transl Cancer Res 3:202-216