Medical and biological data often come in the form of signals (including sequences) and images. In the biomedical setting, microarrays, high-throughput sequencing, protein arrays and many other assays are in widespread use. Similarly, electromagnetic brain imaging techniques (MRI, fMRI and EEG/MEG) are used to study brain anatomy and cortical activity. The nature of these data brings major challenges for statistical analysis: specifically, the number of measurements is often much larger than the number of cases, and there are correlations among the components. The broad aim of this ongoing three-investigator grant is to develop and study statistical techniques that enhance the analysis and interpretation of these data. Our focus in the new projects is the development of models and methods to extract maximal information from these emerging technologies and, as statisticians, to guide the scientist in valid interpretation of the results. The renewal will address these goals through four Specific Aims. The investigators will study:
1. Post-selection inference for comparing internal to external predictors. For genomic and other -omic data, valid statistical comparison of empirical biomarker signatures to standard clinical predictors such as height, weight, and age, using new tools from post-selection inference.
2. Statistical methods for cancer detection via CAPP-seq. Statistical and computational approaches for determining which contiguous regions (tiles) of the genome should be sequenced in the search for cancer mutations, directed toward earlier cancer detection.
3. New settings for high-dimensional eigenstructure in virology and genetics. Eigenvector estimation methods for vaccine design in virology based on mutation sequence data; statistical tools for understanding the distribution of the eigenvalues of large variance component matrices in quantitative genetics, adapting recent advances in statistical random matrix theory.
4. Locally smooth models for MRI data. Improving the sensitivity and resolution of quantitative and diffusion MRI by using models that exploit the spatial structure of the imaging domain.
Working together, and with their students, the investigators will implement the new statistical tools in publicly available software, following a pattern established in earlier cycles of this grant, in which our packages have found wide use among medical researchers both at Stanford and around the world.

Public Health Relevance

Statistical methods such as those to be developed in this project are essential tools to help medical researchers discover and validate new basic science results (for example in imaging and genomics) that can lead to new therapies. They also aid in the design and analysis of clinical investigations of new treatments, so that the large amount of data collected in current research is used as efficiently as possible while the degree of uncertainty in the conclusions is accurately described.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Type
Research Project (R01)
Project #
5R01EB001988-21
Application #
9145735
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Peng, Grace
Project Start
1996-09-10
Project End
2019-06-30
Budget Start
2016-07-01
Budget End
2017-06-30
Support Year
21
Fiscal Year
2016
Total Cost
$446,175
Indirect Cost
$145,269
Name
Stanford University
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94304
Taylor, Jonathan; Tibshirani, Robert (2018) Post-Selection Inference for ℓ1-Penalized Likelihood Models. Can J Stat 46:41-61
Donoho, David L; Gavish, Matan; Johnstone, Iain M (2018) Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model. Ann Stat 46:1742-1778
Pataki, Camille I; Rodrigues, João; Zhang, Lichao et al. (2018) Proteomic analysis of monolayer-integrated proteins on lipid droplets identifies amphipathic interfacial α-helical membrane anchors. Proc Natl Acad Sci U S A 115:E8172-E8180
Johnstone, Iain M (2018) Tail sums of Wishart and Gaussian eigenvalues beyond the bulk edge. Aust N Z J Stat 60:65-74
Johnstone, Iain M; Paul, Debashis (2018) PCA in High Dimensions: An orientation. Proc IEEE Inst Electr Electron Eng 106:1277-1292
Reid, Stephen; Newman, Aaron M; Diehn, Maximilian et al. (2018) Genomic Feature Selection by Coverage Design Optimization. J Appl Stat 45:2658-2676
Powers, Scott; Qian, Junyang; Jung, Kenneth et al. (2018) Some methods for heterogeneous treatment effect estimation in high dimensions. Stat Med 37:1767-1787
Groll, Andreas; Hastie, Trevor; Tutz, Gerhard (2017) Selection of effects in Cox frailty models by regularization methods. Biometrics 73:846-856
Johnstone, I M; Nadler, B (2017) Roy's largest root test under rank-one alternatives. Biometrika 104:181-193
Reid, Stephen; Tibshirani, Robert (2016) Sparse regression and marginal testing using cluster prototypes. Biostatistics 17:364-76

Showing the most recent 10 out of 61 publications