Bayesian Models for Gene Expression with Microarray Data

Mallick, Bani

Abstract

This project is concerned with parametric and semiparametric modeling of gene expression data. DNA microarrays and other high-throughput methods for analyzing complex nucleic acid sequences now make it possible to rapidly, efficiently and accurately measure the expression levels of many genes in a biological sample. The main difficulty with microarray data analysis is that the sample size is very small compared to the dimension of the problem (the number of genes): the number of genes measured for a single individual is usually in the thousands, while the data set contains few individuals. We propose several novel parametric Bayesian modeling approaches for gene selection, tumor classification, Bayesian networks, gene clustering and dimension reduction. Most of the existing methods are not model-based and thus cannot address specific questions regarding formal assessment of uncertainty or assessment of the fit of a specific model. Model-based approaches also offer the potential for extension to more complex situations, e.g., probabilistic mixture modeling and handling missing data. We will develop Bayesian hierarchical models for microarray data, which will flexibly accommodate several modeling factors at different levels. In several of the modeling frameworks, we will treat the dimension of the model space as unknown to add flexibility. It is impossible to obtain analytical answers in these flexible classes of models, so simulation-based Markov chain Monte Carlo (MCMC) methodology with dimension-jumping algorithms will be used to derive the estimates (uncertainty distributions) of the unknown parameters.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 5R01CA104620-02
Application #: 7075306
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Jacobson, James W

Project Start: 2005-06-10
Project End: 2008-05-31
Budget Start: 2006-06-01
Budget End: 2007-05-31
Support Year: 2
Fiscal Year: 2006
Total Cost: $280,610
Indirect Cost

Institution

Name: Texas A&M University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 078592789

City: College Station
State: TX
Country: United States
Zip Code: 77845

Related projects


NIH 2007 R01 CA	Bayesian Models for Gene Expression with Microarray Data	Mallick, Bani K. / Texas A&M University	$272,472
NIH 2006 R01 CA	Bayesian Models for Gene Expression with Microarray Data	Mallick, Bani K. / Texas A&M University	$280,610
NIH 2005 R01 CA	Bayesian Models for Gene Expression with Microarray Data	Mallick, Bani K. / Texas A&M University	$287,363

Publications

Sinha, Samiran; Mallick, Bani K; Kipnis, Victor et al. (2010) Semiparametric Bayesian analysis of nutritional epidemiology data in the presence of measurement error. Biometrics 66:444-54

Vahedi, Golnaz; Faryabi, Babak; Chamberland, Jean-Francois et al. (2009) Optimal intervention strategies for cyclic therapeutic methods. IEEE Trans Biomed Eng 56:281-91

Chen, Yi-Hau; Chatterjee, Nilanjan; Carroll, Raymond J (2009) Shrinkage Estimators for Robust and Efficient Inference in Haplotype-Based Case-Control Studies. J Am Stat Assoc 104:220-233

Apanasovich, Tatiyana V; Carroll, Raymond J; Maity, Arnab (2009) SIMEX and standard error estimation in semiparametric measurement error models. Electron J Stat 3:318-348

Wang, Yuedong; Ma, Yanyuan; Carroll, Raymond J (2009) Variance estimation in the analysis of microarray data. J R Stat Soc Series B Stat Methodol 71:425-445

Vanamala, J; Glagolenko, A; Yang, P et al. (2008) Dietary fish oil and pectin enhance colonocyte apoptosis in part through suppression of PPARdelta/PGE2 and elevation of PGE3. Carcinogenesis 29:790-6

Apanasovich, Tatiyana V; Ruppert, David; Lupton, Joanne R et al. (2008) Aberrant crypt foci and semiparametric modeling of correlated binary data. Biometrics 64:490-500

Cheon, Sooyoung; Liang, Faming (2008) Phylogenetic tree construction using sequential stochastic approximation Monte Carlo. Biosystems 91:94-107

Faryabi, B; Datta, A; Dougherty, E R (2007) On approximate stochastic control in genetic regulatory networks. IET Syst Biol 1:361-8

Gold, David L; Coombes, Kevin R; Wang, Jing et al. (2007) Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL. Brief Bioinform 8:71-7

Showing the most recent 10 out of 11 publications
