Electrophoresis can resolve thousands of proteins, but analyzing the resulting data is extremely difficult. Automated methods for detecting and recognizing proteins on two-dimensional (2D) electrophoretograms are needed. Image processing techniques and expert systems have been tried, but they are computationally intensive and require heuristic rule bases. In Phase I and in a recent IR&D project, ORINCON developed a protein identification system based on automated image spot correspondence processing and a hierarchy of neural nets that recognizes protein distributions in 2D electrophoretograms. This system demonstrates the feasibility of differentiating and correctly classifying different clinical conditions from 2D electrophoretograms. In Phase II we will further develop these techniques to handle large numbers of image spots for an expanded set of known proteins, enabling analysis of entire 2D electrophoresis biological samples. The technique will be tested on a large data set collected by Scripps Clinic and Large Scale Biology (LSB) Corporation. This work would lead to development of a commercial package for automated protein analysis of 2D electrophoretograms. The package could readily be modified to categorize other constituents (e.g., amino and nucleic acids) via analysis of appropriate data sets.
The research should lead to development of a commercial package for automated processing of 2D electrophoresis protein data. Applications of this technology would be widespread: in addition to laboratories, where electrophoresis is common, the technology could be applied in clinical settings, where patient protein analysis would be an important diagnostic tool for differentiating between normal and diseased states. The technique could potentially be used as a cancer screening tool; over one million cases of cancer are diagnosed in the U.S. every year.