In the post-genomic era, proteomics has become one of the most important research areas in modern science, with the potential to influence medical science for years to come. By separating, cataloging, and comparing proteins from normal and diseased cells and tissues, we gain invaluable knowledge about changes taking place in complex biological systems at the molecular level, which in turn leads to better diagnostics and therapeutics. Two-dimensional (2D) gel electrophoresis, used in conjunction with a protein identification method such as mass spectrometry, could provide the front end for comparing protein expression in large collections of samples. Although the technology continues to evolve, current methods of protein separation are riddled with noise, which makes separation and subsequent identification extremely difficult. The need for rigorous statistical analysis that incorporates the inherent uncertainty in the separation techniques has been acknowledged for many years. Yet the literature is surprisingly silent on methodologies that properly address statistical bias and variation in 2D gel and mass spectrometry data. Implemented methods are, on the whole, simplistic and ad hoc.
The present proposal aims to develop sophisticated mathematical and statistical tools for removing these analytical bottlenecks. Gel alignment techniques will be developed that obviate the need for subjectively designating one of the gels as the reference gel and thus reduce the bias in subsequent statistical analyses. The new alignment techniques will be fully automated, requiring little or no human intervention. They will rely on modern interior point methods for quadratic and nonlinear programming. Independent component analysis (ICA) will be applied to identify latent, biologically relevant factors that contribute to differential protein expression. Methods for performing statistical hypothesis tests based on independent components will also be developed. Finally, new predictive classification methods will be used to investigate relationships between phenotypes and protein profiles generated from 2D gel and mass spectrometry data.
The deliverables of our research are a set of broadly applicable theoretical and computational tools that will have a direct impact on proteomics. The general research strategy is: 1) to assemble an initial set of tools; 2) to initiate proteomics studies using the available tools; 3) to continue to develop and refine these tools and to tackle more complex applications as the tools become more sophisticated. The results of the proposed research, such as the new methods for image registration, the ICA-based methods for statistical hypothesis tests, and the predictive classification methods, are also relevant to areas outside the field of proteomics.
Li, Feng; Seillier-Moiseiwitsch, Françoise (2011) Analyzing 2D gel images using a two-component empirical Bayes model. BMC Bioinformatics 12:433
Li, Feng; Seillier-Moiseiwitsch, Françoise; Korostyshevskiy, Valeriy R (2011) Region-based Statistical Analysis of 2D PAGE Images. Comput Stat Data Anal 55:3059-3072
Safavi, Haleh; Correa, Nicolle; Xiong, Wei et al. (2008) Independent component analysis of 2-D electrophoresis gels. Electrophoresis 29:4017-26
Tang, Yongqiang; Ghosal, Subhashis; Roy, Anindya (2007) Nonparametric Bayesian estimation of positive false discovery rates. Biometrics 63:1126-34
Potra, Florian A; Liu, Xing (2006) Aligning families of two-dimensional gels by a combined multiresolution forward-inverse transformation approach. J Comput Biol 13:1384-95
Potra, Florian A; Liu, Xing; Seillier-Moiseiwitsch, Françoise et al. (2006) Protein image alignment via piecewise affine transformations. J Comput Biol 13:614-30