Medical and biological data often come in the form of digitized signals and images, for example gene expression microarrays, mass spectrograms, and flow cytometry cell plots. As instrumental data acquisition becomes routine, sequences of such images, signals or paths are collected, often along with other covariate measurements, resulting in datasets where the basic unit of measurement, or response, is a very high-dimensional object. The gene microarray is a leading example of how new technology has led to data acquisition on a massive scale. The project continues to focus on developing techniques for modeling and understanding such data that naturally adapt to the high dimensionality. For studying genomic divergence of bacterial strains using comparative genomic hybridization (CGH), we propose latent variable models that incorporate a statistical method called the "fused lasso" to jointly model the CGH measurements from the bacteria. For flow cytometry analysis of cancer cells, we propose a method for identifying new sub-populations that have emerged after stimulation of the cells. We also propose to develop and study techniques for prediction and clustering for high-dimensional data. Much of this work will be carried out in existing and new collaborations with researchers in medicine and biology, working for example in cancer and autoimmune diseases.

Project Narrative: This work can potentially improve the understanding, diagnosis and prognosis of human diseases such as cancer, heart disease and AIDS, and hence can help to improve the overall quality of public health in the U.S.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Type
Research Project (R01)
Project #
2R01EB001988-12
Application #
7365471
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Peng, Grace
Project Start
1996-09-10
Project End
2011-06-30
Budget Start
2007-09-14
Budget End
2008-06-30
Support Year
12
Fiscal Year
2007
Total Cost
$323,835
Indirect Cost
Name
Stanford University
Department
Miscellaneous
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305
Taylor, Jonathan; Tibshirani, Robert (2018) Post-Selection Inference for ℓ1-Penalized Likelihood Models. Can J Stat 46:41-61
Donoho, David L; Gavish, Matan; Johnstone, Iain M (2018) Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model. Ann Stat 46:1742-1778
Pataki, Camille I; Rodrigues, João; Zhang, Lichao et al. (2018) Proteomic analysis of monolayer-integrated proteins on lipid droplets identifies amphipathic interfacial α-helical membrane anchors. Proc Natl Acad Sci U S A 115:E8172-E8180
Johnstone, Iain M (2018) Tail sums of Wishart and Gaussian eigenvalues beyond the bulk edge. Aust N Z J Stat 60:65-74
Johnstone, Iain M; Paul, Debashis (2018) PCA in High Dimensions: An orientation. Proc IEEE Inst Electr Electron Eng 106:1277-1292
Reid, Stephen; Newman, Aaron M; Diehn, Maximilian et al. (2018) Genomic Feature Selection by Coverage Design Optimization. J Appl Stat 45:2658-2676
Powers, Scott; Qian, Junyang; Jung, Kenneth et al. (2018) Some methods for heterogeneous treatment effect estimation in high dimensions. Stat Med 37:1767-1787
Groll, Andreas; Hastie, Trevor; Tutz, Gerhard (2017) Selection of effects in Cox frailty models by regularization methods. Biometrics 73:846-856
Johnstone, I M; Nadler, B (2017) Roy's largest root test under rank-one alternatives. Biometrika 104:181-193
Reid, Stephen; Tibshirani, Robert (2016) Sparse regression and marginal testing using cluster prototypes. Biostatistics 17:364-76

Showing the most recent 10 out of 61 publications