The overall objective of this research proposal is to develop new Bayesian methodologies for the analysis of data that arise in genomics. Of particular interest are situations where a large number of variables is available and selection of a predictive subset is one of the goals. The theoretical developments we propose are motivated by a variety of studies, some conducted by our biomedical collaborators, using DNA microarray technologies. One of the goals of this project is to contribute novel theoretical developments in variable and feature selection in statistics. Another goal is to provide the biomedical community with sound methods for the analysis of high-dimensional data. The identification of important biomarkers will provide a better understanding of the molecular mechanisms involved in specific diseases, and will in turn improve diagnosis, drug development, and treatment of patients.
The specific aims of our proposed research are:
1. Clustering of High-Dimensional Data: We will develop novel Bayesian methods for simultaneously clustering experimental units and identifying the variables that best discriminate the different groups.
2. Analysis of High-Dimensional Data with Censored Survival Outcomes: We will investigate novel methods for variable selection in parametric survival models. The methods will yield estimates of survival and identify the predictive variables.
3. Application to Microarray Studies: We will apply the methods of Specific Aims 1 and 2 to a series of biomedical studies involving microarray data. These include studies of rheumatoid arthritis, osteoarthritis, and adult acute lymphoblastic leukemia.
4. Application to Proteomic Data: We will adapt our methodologies to the problem of extracting important features from proteomics data, incorporating wavelet-based dimension reduction techniques.
5. Software Development: We will develop statistical software and make it available to the public.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG003319-03
Application #: 7218031
Study Section: Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer: Struewing, Jeffery P
Project Start: 2005-04-01
Project End: 2007-09-01
Budget Start: 2007-04-01
Budget End: 2007-09-01
Support Year: 3
Fiscal Year: 2007
Total Cost: $49,406
Indirect Cost:
Name: Texas A&M University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 078592789
City: College Station
State: TX
Country: United States
Zip Code: 77845
Stingo, Francesco C; Chen, Yian A; Tadesse, Mahlet G et al. (2011) Incorporating biological information into linear models: a Bayesian approach to the selection of pathways and genes. Ann Appl Stat 5:1978-2002
Savitsky, Terrance; Vannucci, Marina; Sha, Naijun (2011) Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies. Stat Sci 26:130-149
Stingo, Francesco C; Vannucci, Marina (2011) Variable selection for discriminant analysis with Markov random field priors for the analysis of microarray data. Bioinformatics 27:495-501
Trevino, Victor; Tadesse, Mahlet G; Vannucci, Marina et al. (2011) Analysis of normal-tumour tissue interaction in tumours: prediction of prostate cancer features from the molecular profile of adjacent normal cells. PLoS One 6:e16492
Preter, M; Lee, S H; Petkova, E et al. (2011) Controlled cross-over study in normal subjects of naloxone-preceding-lactate infusions; respiratory and subjective responses: relationship to endogenous opioid system, suffocation false alarm theory and childhood parental loss. Psychol Med 41:385-93
Stingo, Francesco C; Chen, Yian A; Vannucci, Marina et al. (2010) A Bayesian graphical modeling approach to microRNA regulatory network inference. Ann Appl Stat 4:2024-2048
Zhu, Hongxiao; Vannucci, Marina; Cox, Dennis D (2010) A Bayesian hierarchical model for classification with selection of functional predictors. Biometrics 66:463-73
Savitsky, Terrance; Vannucci, Marina (2010) Spiked Dirichlet Process Priors for Gaussian Process Models. J Probab Stat 2010:201489
Koshelev, Misha; Lohrenz, Terry; Vannucci, Marina et al. (2010) Biosensor approach to psychopathology classification. PLoS Comput Biol 6:e1000966
Lennox, Kristin P; Dahl, David B; Vannucci, Marina et al. (2010) A Dirichlet process mixture of hidden Markov models for protein structure prediction. Ann Appl Stat 4:916-942

Showing the most recent 10 out of 21 publications