Transcriptional regulation is one of the crucial mechanisms living systems use to regulate protein levels. Dysregulation of gene expression underlies the toxic effects of many chemicals, and gene expression changes are often reliable markers of disease. A better understanding of the mechanisms that regulate gene expression is likely to improve our ability to treat human disease effectively and to predict the effects of environmental toxicants. Identifying groups of co-expressed genes by cluster analysis of microarray data has been a commonly used approach for characterizing patterns of gene expression. However, currently used computational tools for cluster analysis are inadequate for quantifying the reproducibility of observed patterns. We propose to develop computational tools for efficient and reproducible extraction of biologically significant patterns from functional genomics data. The proposed computational procedures will be based on the Bayesian infinite mixture model. This approach allows for efficient use of the information in the data and for assessing the reproducibility of observed patterns. Precise modeling of uncertainty in cluster analysis will be especially beneficial when clusters of co-expressed genes are used as a starting point for characterizing the co-regulation of such genes. Joint modeling of functional genomics data and genomic regulatory sequences will facilitate optimal information exchange between these two data types. During the R21 portion of the grant, computational tools based on the Bayesian infinite mixture model will be validated. During the R33 portion, joint expression-sequence data models will be validated and all computational procedures will be incorporated into a user-friendly, public-domain software package. Key features of the software will be an intuitive graphical user interface and the ability to directly access, manipulate, and analyze diverse types of data.
In addition to the newly developed computational methods, the software will incorporate other relevant statistical techniques for correlating gene expression data with cis-regulatory element data. Using this software, biomedical researchers will be able to draw reliable and reproducible conclusions about gene expression patterns and the regulatory elements associated with these patterns.
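One way the reproducibility assessment described above could be made concrete is to summarize a set of sampled clusterings by pairwise co-clustering frequencies: for each pair of genes, the fraction of sampled partitions in which the two genes share a cluster. The following is a minimal sketch under stated assumptions, not the project's implementation. It assumes that some posterior sampler for the mixture model has already produced a list of partitions; the function name `coclustering_probabilities` and the toy partitions are hypothetical.

```python
from itertools import combinations


def coclustering_probabilities(sampled_partitions):
    """Estimate pairwise co-clustering frequencies.

    sampled_partitions: list of partitions, each a list of cluster labels
    with one label per gene (assumed to come from a posterior sampler).
    Returns a dict mapping each gene pair (i, j), i < j, to the fraction
    of partitions in which genes i and j carry the same label.
    """
    if not sampled_partitions:
        raise ValueError("need at least one sampled partition")
    n_genes = len(sampled_partitions[0])
    counts = {pair: 0 for pair in combinations(range(n_genes), 2)}
    for labels in sampled_partitions:
        if len(labels) != n_genes:
            raise ValueError("all partitions must label the same genes")
        for i, j in counts:
            # Labels are only meaningful within one partition, so compare
            # membership rather than the label values themselves.
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    n_samples = len(sampled_partitions)
    return {pair: count / n_samples for pair, count in counts.items()}


# Hypothetical sampler output: four sampled partitions of four genes.
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
probs = coclustering_probabilities(samples)
```

In this toy input, genes 0 and 1 are grouped together in every sampled partition, while genes 2 and 3 are grouped together in only half of them. A frequency near 1 marks a pairing that is stable across samples, and intermediate values flag pairings that a single "best" clustering would report without any indication of uncertainty.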
Liu, X; Sivaganesan, S; Yeung, K Y et al. (2006) Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset. Bioinformatics 22:1737-44
Yeung, Ka Yee; Medvedovic, Mario; Bumgarner, Roger E (2004) From co-expression to co-regulation: how many microarray experiments do we need? Genome Biol 5:R48
Medvedovic, M; Yeung, K Y; Bumgarner, R E (2004) Bayesian mixture model based clustering of replicated microarray data. Bioinformatics 20:1222-32