Molecular signature-guided clinical therapies are critical to advancing the treatment of cancer, and there has been a recent explosion in the types of molecular data that can be used to identify mutations, expression levels, and methylation patterns (and combinations of these effects) that contribute to cancer gene function. Vast stores of such data are now publicly available in repositories like The Cancer Genome Atlas and the International Cancer Genome Consortium, where they await statistical analysis. The central problem in analyzing these data is one of finding a needle in a haystack: identifying the few important prognostic factors hidden among enormous numbers of non-prognostic factors. The investigators of this project have recently developed a new method that accomplishes this. Their approach has been shown to correctly identify factors that predict outcomes when there are far more candidate predictors than observations of an outcome, and recent theoretical developments and simulation studies demonstrate that these results extend to settings in which the number of measured gene expression values vastly exceeds the number of tissue samples from cancer patients. The goals of this project are to extend these methods to broader classes of patient outcome data; to make them computationally efficient enough for routine application to massive genomic datasets; to apply them to existing cancer studies; and to incorporate them into software tools distributed to cancer researchers worldwide, enabling more effective identification of genetic mutations that are either associated with cancer function or predictive of the success of new or existing cancer therapies.
Model selection procedures are statistical techniques that allow researchers to discover associations between disease and the large numbers of variables measured by emerging high-throughput screening technologies. For example, model selection techniques are used to discover which genes are associated with particular forms of cancer. This project proposes a new class of model selection procedures that will make it easier for researchers to discover such associations.
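The statistical setting above can be illustrated with a toy example. The sketch below simulates the "many more variables than observations" problem and recovers the truly associated predictors with simple marginal-correlation screening; this stand-in technique is only an assumption for illustration and is not the nonlocal-prior Bayesian method developed in this project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Far more candidate predictors (e.g., genes) than observations (e.g., samples).
n, p = 100, 1000
X = rng.standard_normal((n, p))

# The outcome depends on only two of the 1000 predictors (indices 0 and 1).
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + 0.5 * rng.standard_normal(n)

# Rank predictors by absolute marginal correlation with the outcome.
Xc = (X - X.mean(axis=0)) / X.std(axis=0)
yc = (y - y.mean()) / y.std()
corr = np.abs(Xc.T @ yc) / n

# Keep the two most strongly associated predictors.
top2 = {int(i) for i in np.argsort(corr)[-2:]}
print(sorted(top2))
```

With a strong signal, screening recovers the two truly associated predictors out of 1000 candidates; the project's methods target exactly this regime but with formal Bayesian control over which variables are selected.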
Shin, Minsuk; Bhattacharya, Anirban; Johnson, Valen E (2018) Scalable Bayesian Variable Selection Using Nonlocal Prior Densities in Ultrahigh-dimensional Settings. Stat Sin 28:1053-1078
Rossell, David; Telesca, Donatello (2017) Non-local Priors for High-dimensional Estimation. J Am Stat Assoc 112:254-265
Papaspiliopoulos, O; Rossell, D (2017) Bayesian block-diagonal variable selection and model averaging. Biometrika 104:343-359
Johnson, Valen E; Payne, Richard D; Wang, Tianying et al. (2017) On the Reproducibility of Psychological Science. J Am Stat Assoc 112:1-10
Liu, Suyu; Johnson, Valen E (2016) A robust Bayesian dose-finding design for phase I/II clinical trials. Biostatistics 17:249-63
Nikooienejad, Amir; Wang, Wenyi; Johnson, Valen E (2016) Bayesian variable selection for binary outcomes in high-dimensional genomic studies using non-local priors. Bioinformatics 32:1338-45
Hu, Jianhua; Zhu, Hongjian; Hu, Feifang (2015) A Unified Family of Covariate-Adjusted Response-Adaptive Designs Based on Efficiency and Ethics. J Am Stat Assoc 110:357-367
Yajima, Masanao; Telesca, Donatello; Ji, Yuan et al. (2015) Detecting differential patterns of interaction in molecular pathways. Biostatistics 16:240-51
Rossell, David (2015) Big Data and Statistics: A Statistician's Perspective. Metode Sci Stud J 5:143-149
Stephan-Otto Attolini, Camille; Peña, Victor; Rossell, David (2015) Designing alternative splicing RNA-seq studies. Beyond generic guidelines. Bioinformatics 31:3631-7
Showing the most recent 10 out of 23 publications