The field of proteomics seeks to fully understand all proteins expressed by the genome of an organism. Experiments performed on biological samples use a variety of analytical techniques to profile and characterize the expressed protein set. For discovery research, proteomic scientists require that the results of such experiments be integrated and reviewed alongside relevant functional protein annotations available from proprietary and public sources. The long-term goal of this project is to design an open proteomic decision support system (PDSS) that will support this type of computer-assisted discovery research. This Phase I proposal is for the analysis and design of a data warehouse model that will: (1) support the addition of biological meaning to experimental data; (2) allow scientists to discover important relationships between expressed proteins and biological function in health and disease; (3) increase confidence in experimental findings through the compilation of evidence; and (4) provide an integrated proteomic repository that will facilitate data mining efforts in drug discovery.
The specific aims of Phase I are to: (i) develop initial requirements for the PDSS; (ii) develop an object model; (iii) design the data warehouse; and (iv) prototype, test, and evaluate the design. Theoretical models of the data warehouse will be designed, and prototypes will be generated from them. The proteomics group of a collaborating pharmaceutical company will serve as a case study for prototype evaluation. Success will be measured by the ability of the PDSS to answer questions of scientific interest to the proteomic community easily and quickly. Potential problems arising from data availability and evolving data sources will be addressed. Phase II will expand the model and design the application and user interface layers of the PDSS software system.