In cancer research, profiling studies have been conducted extensively, measuring genome-wide gene expression levels, DNA modifications, epigenetic regulation, and post-transcriptional regulation. Many studies are "one-dimensional" and restricted to a single type of genomic measurement. More recently, "multi-dimensional" studies have become increasingly popular. In such studies, the same samples are profiled on multiple layers of genomic activity. A representative example is The Cancer Genome Atlas (TCGA). Multi-dimensional studies offer a unique opportunity to describe the etiology and prognosis of cancer more comprehensively. In the literature, much effort has been devoted to modeling the interconnections among different types of regulation. In contrast, relatively few studies have conducted integrated analysis that models the associations between multiple types of genomic measurements and cancer outcomes. The existing integrated analysis methods also have serious limitations, which may lead to suboptimal or even biased results. Our goal is to describe cancer etiology and prognosis more effectively by analyzing multi-dimensional genomic data. Motivated by the limitations of existing studies, our first objective is to develop novel statistical methods that effectively integrate multi-dimensional genomic measurements and establish their associations with cancer outcomes. This objective differs significantly from those of published studies. The proposed methods will have significant advantages. They will assume different biological working models, allowing for a direct comparison of these models. They will be applicable to a large number of datasets, will accommodate the joint effects of a large number of markers, and will adopt efficient statistical techniques. The second objective is to apply these methods to analyze TCGA data on multiple types of cancers.
The specific aims are: (Aim 1) Develop novel statistical methods to integrate multiple types of genomic measurements for cancer outcomes. Three different methods will be developed under different data-generating models;
(Aim 2) Develop user-friendly software and a project website. Analyze TCGA data on multiple types of cancers, particularly cancers of the breast, ovary, and prostate, as well as lymphoma. Such data include measurements of gene expression, copy number variation, methylation, microRNA, and others.
With the cost of sequencing falling fast, multi-dimensional genomic profiling of samples will soon become routine practice. This study will deliver a new analysis strategy and a set of novel statistical methods. These methods will integrate multiple types of genomic measurements for cancer outcomes and complement the existing methods. The analysis of TCGA data will provide valuable insights into multiple cancers and serve as a prototype for future applications.
Novel statistical methods will be developed to analyze cancer studies with multiple types of genomic measurements. Three methods will be developed to associate genomic measurements with cancer outcomes under different model assumptions. TCGA data on multiple types of cancers will be analyzed.
Zhao, Qing; Shi, Xingjie; Xie, Yang et al. (2015) Combining multidimensional genomic measurements for predicting cancer prognosis: observations from TCGA. Brief Bioinform 16:291-303
Wang, Yu; Ma, Shuangge (2014) Risk factors for etiology and prognosis of mantle cell lymphoma. Expert Rev Hematol 7:233-43
Wu, Cen; Cui, Yuehua; Ma, Shuangge (2014) Integrative analysis of gene-environment interactions under a multi-response partially linear varying coefficient model. Stat Med 33:4988-98
Zhu, Ruoqing; Zhao, Hongyu; Ma, Shuangge (2014) Identifying gene-environment and gene-gene interactions using a progressive penalization approach. Genet Epidemiol 38:353-68