This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

It is widely agreed that expression array technologies, when carefully applied, can effectively uncover the gene or protein expression program of a cell. In the best of cases, even relatively small shifts in fairly scarce proteins can be detected. At the same time, the degree of experimental and natural variability of these measurements is not yet well understood, and many labs have reported substantial difficulty in achieving reproducible results. In an ideal world, replication at every experimental condition would provide an independent, protein-specific variability estimate that could be used to place confidence bounds on any downstream results. In practice, expression array technologies are far too expensive to permit casual replication. Statistical techniques allow us to borrow strength across different experiments, letting us ascribe confidence to our substantive conclusions more economically.

Normalization has been one of the trickier aspects of preprocessing for 2D gels: because of uncontrollable variation in the conditions under which the assay is carried out, some gels will be regionally or even globally brighter than others. The standard normalization procedure in both cases has been ratio normalization in log-space to a set of control proteins; this method will serve as our baseline. However, we intend to examine several more refined approaches to normalization, guided by the notion that a significant shift in the distribution of abundances of cellular proteins is unlikely to be compatible with viability.
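As a sketch of the baseline method, log-space ratio normalization to control proteins can be implemented by subtracting each sample's mean control log-intensity, which is equivalent to dividing by the geometric mean of the controls in raw space. The data shapes, control indices, and synthetic values below are illustrative assumptions, not part of the project's actual pipeline.

```python
import numpy as np

def ratio_normalize_log(intensities, control_idx):
    """Shift each sample's log2 intensities so that the mean
    log2 intensity over the control proteins agrees across samples.

    intensities: (n_proteins, n_samples) array of raw intensities
    control_idx: row indices of control (e.g. housekeeping) proteins
    """
    log_vals = np.log2(intensities)
    # Per-sample mean log-intensity over the control set.
    control_means = log_vals[control_idx].mean(axis=0)
    # Subtracting in log space == dividing by the geometric
    # mean of the controls in raw intensity space.
    return log_vals - control_means

# Illustrative data: 5 proteins x 3 samples, sample 3 globally 2x brighter.
rng = np.random.default_rng(0)
raw = rng.uniform(100, 1000, size=(5, 3))
raw[:, 2] *= 2.0
normalized = ratio_normalize_log(raw, control_idx=[0, 1])
```

After normalization the control proteins have a common mean (zero in this centering) in every sample, so a global brightness shift such as the doubled third sample no longer distorts cross-sample comparisons.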
Whenever such shifts are detected, for instance through the use of quantile-quantile (QQ) plots, a warning should be raised and corrective measures considered. Two approaches that are popular in digital image processing are quantile normalization and histogram equalization. Quantile normalization procedures, in which the sorted average intensities are adjusted to more nearly match those of a standard chip by forcing 100 individual quantiles to take the same values via a linear spline model (i.e., a monotone piecewise-linear transformation), have been applied successfully by the co-PI to gene expression data (Li, et al., 2002).

The next logical step in preprocessing, especially in the presence of duplicate gels for the same experimental condition, is to determine the effects of the different sources of variation present in the data and to correct for them wherever appropriate. For this task, analysis of variance (ANOVA) models will be fitted, providing a consistent framework for estimating effect sizes, assessing their significance, and correcting for them. For example, the fitted model can be used to eliminate the first-order component of nuisance factors, such as effects due to technical aspects of the assay technology, while also testing the significance of factors of primary scientific interest (different stimulus levels).
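The quantile-matching idea described above can be sketched as follows: match roughly 100 quantiles of a sample to those of a reference and interpolate linearly between the matched knots, giving a monotone piecewise-linear map. This is only a rough illustration of the approach in Li et al. (2002), whose details differ; the distributions and parameters here are made up for the example.

```python
import numpy as np

def quantile_match(sample, reference, n_quantiles=100):
    """Monotone piecewise-linear map sending the sample's quantiles
    onto the reference's quantiles, with linear interpolation between
    the matched quantile knots."""
    probs = np.linspace(0.0, 1.0, n_quantiles + 1)
    # Knot locations: matched quantiles of sample and reference.
    src_q = np.quantile(sample, probs)
    ref_q = np.quantile(reference, probs)
    # np.interp over monotone knots yields a monotone
    # piecewise-linear transformation of the sample values.
    return np.interp(sample, src_q, ref_q)

# Illustrative intensities: the sample is globally brighter and more
# spread out than the reference, the kind of shift a QQ plot flags.
rng = np.random.default_rng(1)
reference = rng.lognormal(mean=5.0, sigma=1.0, size=2000)
sample = 1.5 * rng.lognormal(mean=5.0, sigma=1.2, size=2000)
adjusted = quantile_match(sample, reference)
```

Because the map is monotone, the rank order of intensities within the sample is preserved; only the marginal distribution is pulled toward the reference, which is exactly what a QQ plot of `adjusted` against `reference` would show.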
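To illustrate the ANOVA-based correction of nuisance factors, the sketch below fits a two-way fixed-effects model (condition plus gel/batch, no interaction) by least squares and subtracts the estimated batch component. The factor names, codings, and synthetic effect sizes are assumptions made for the example, not the project's actual model.

```python
import numpy as np

def remove_batch_effect(y, batch, condition):
    """Fit y ~ condition + batch by least squares and return y with
    the estimated batch (nuisance) component removed.

    Assumes at least two batch levels; first level of each factor
    is the reference (its dummy column is dropped)."""
    n = len(y)
    conds = np.unique(condition)
    batches = np.unique(batch)
    # Design matrix: intercept + dummy columns for non-reference levels.
    cols = [np.ones(n)]
    cols += [(condition == c).astype(float) for c in conds[1:]]
    batch_cols = [(batch == b).astype(float) for b in batches[1:]]
    X = np.column_stack(cols + batch_cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Batch coefficients occupy the trailing len(batches)-1 slots.
    n_batch = len(batches) - 1
    batch_effect = np.column_stack(batch_cols) @ beta[-n_batch:]
    return y - batch_effect

# Synthetic example: two stimulus levels, two gels, gel 2 shifted by +3.
rng = np.random.default_rng(2)
condition = np.array([0, 0, 1, 1, 0, 0, 1, 1])
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = 10.0 + 2.0 * condition + 3.0 * batch + rng.normal(0.0, 0.1, 8)
corrected = remove_batch_effect(y, batch, condition)
```

With the batch component removed, the two gels agree up to noise while the condition effect of interest (here +2) is left intact, which is the "eliminate the first-order component of nuisance factors" step described above.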