Consistent variable selection in p>>n settings

Johnson, Valen

Abstract

Molecular signature-guided clinical therapies are critical to advancing the treatment of cancer, and there has been a recent explosion in the number of types of molecular data that can potentially be used to identify mutations, expression levels, and methylations (and combinations of these effects) that contribute to cancer gene functioning. Vast stores of such data are now publicly available in repositories such as The Cancer Genome Atlas and the International Cancer Genome Consortium, where they await statistical analysis. The central problem in analyzing these data, akin to finding a needle in a haystack, is identifying the important prognostic factors among huge numbers of non-prognostic factors. The investigators of this project have recently developed a new method that can accomplish this. Their approach has been shown to correctly identify the factors that predict outcomes when there are many more candidate predictors than observations of an outcome (p >> n), and recent theoretical developments and simulation studies have demonstrated that these results extend to situations in which there are vastly more gene expression values than tissue samples from cancer patients. The goals of this project are to extend these methods to broader classes of patient outcome data; to make them computationally efficient enough to be applied routinely to massive genomic datasets; to apply them to existing cancer studies; and to incorporate them into software tools that can be distributed to cancer researchers throughout the world, so that they can more effectively identify genetic mutations that are either associated with cancer functioning or predictive of the success of new or existing cancer therapies.

Public Health Relevance

Model selection procedures are statistical techniques that allow researchers to discover the associations between disease and the large number of variables that are measured in emerging high-throughput screening technologies. For example, model selection techniques are used to discover which genes are associated with particular forms of cancer. This project proposes a new class of model selection procedures that will make it easier for researchers to discover such associations.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 5R01CA158113-07
Application #: 9534527
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Chen, Huann-Sheng

Project Start: 2011-04-01
Project End: 2021-07-31
Budget Start: 2018-08-01
Budget End: 2019-07-31
Support Year: 7
Fiscal Year: 2018
Total Cost
Indirect Cost

Institution

Name: Texas A&M University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 020271826

City: College Station
State: TX
Country: United States
Zip Code: 77845

Related projects


NIH 2020 R01 CA	Consistent variable selection in p>>n settings Johnson, Valen Earl / Texas A&M University
NIH 2019 R01 CA	Consistent variable selection in p>>n settings Johnson, Valen Earl / Texas A&M University
NIH 2018 R01 CA	Consistent variable selection in p>>n settings Johnson, Valen Earl / Texas A&M University
NIH 2017 R01 CA	Consistent variable selection in p>>n settings Johnson, Valen Earl / Texas A&M University
NIH 2016 R01 CA	Consistent variable selection in p>>n settings Johnson, Valen Earl / Texas A&M University	$333,201
NIH 2014 R01 CA	Consistent model selection in the p>>n setting Johnson, Valen Earl / Texas A&M University	$290,219
NIH 2013 R01 CA	Consistent model selection in the p>>n setting Johnson, Valen Earl / Texas A&M University	$280,925
NIH 2012 R01 CA	Consistent model selection in the p>>n setting Johnson, Valen Earl / University of Texas MD Anderson Cancer Center	$291,267
NIH 2011 R01 CA	Consistent model selection in the p>>n setting Johnson, Valen Earl / University of Texas MD Anderson Cancer Center	$310,258

Publications

Shin, Minsuk; Bhattacharya, Anirban; Johnson, Valen E (2018) Scalable Bayesian Variable Selection Using Nonlocal Prior Densities in Ultrahigh-dimensional Settings. Stat Sin 28:1053-1078

Rossell, David; Telesca, Donatello (2017) Non-local priors for high-dimensional estimation. J Am Stat Assoc 112:254-265

Papaspiliopoulos, O; Rossell, D (2017) Bayesian block-diagonal variable selection and model averaging. Biometrika 104:343-359

Johnson, Valen E; Payne, Richard D; Wang, Tianying et al. (2017) On the Reproducibility of Psychological Science. J Am Stat Assoc 112:1-10

Liu, Suyu; Johnson, Valen E (2016) A robust Bayesian dose-finding design for phase I/II clinical trials. Biostatistics 17:249-63

Nikooienejad, Amir; Wang, Wenyi; Johnson, Valen E (2016) Bayesian variable selection for binary outcomes in high-dimensional genomic studies using non-local priors. Bioinformatics 32:1338-45

Wang, Yuan; Hobbs, Brian P; Hu, Jianhua et al. (2015) Predictive classification of correlated targets with application to detection of metastatic cancer using functional CT imaging. Biometrics 71:792-802

Hu, Jianhua; Zhu, Hongjian; Hu, Feifang (2015) A Unified Family of Covariate-Adjusted Response-Adaptive Designs Based on Efficiency and Ethics. J Am Stat Assoc 110:357-367

Yajima, Masanao; Telesca, Donatello; Ji, Yuan et al. (2015) Detecting differential patterns of interaction in molecular pathways. Biostatistics 16:240-51

Rossell, David (2015) Big data and statistics: a statistician's perspective. Metode Sci Stud J 5:143-149

Showing the most recent 10 out of 23 publications
