In recent years, the development and availability of omic-based technologies have moved analytical research to the forefront of biology. A desirable approach to systems-level biology research is to iterate between computation and experimentation. Specifically, by using computational, statistical, and visualization-based techniques to interrogate the data, new experimental hypotheses can be developed and subsequently tested in the laboratory. However, the volume and heterogeneity of data generated by high-throughput methods have created a need for improved methods of data integration and interpretation. The focus of this proposal is the continued development and maintenance of our existing visual analytics software: Platform for Proteomics Peptide and Protein data exploration (PQuad), a multi-resolution environment that can currently integrate genomic and proteomic data for complex prokaryotic datasets. PQuad can currently identify peptides and proteins that are differentially expressed between two experiments and perform basic data integration of categorical information. The interrogation of multiple lines of evidence in prokaryotic systems has immediate significance for identifying virulence determinants in pathogens. We propose to continue the development of PQuad in two core areas: (1) advanced user interaction and (2) enhanced visualizations.
Specific Aim #1: The development of an advanced user interface that will guide users in uploading multiple sources of information (both experimental data and metadata), performing queries to target specific biomolecules of interest, and exporting query results for further exploration outside of PQuad. In addition, we will offer the ability to perform basic statistical analyses of MS-based proteomic peptide identifications, the results of which can be used to threshold queries and visualizations.
Specific Aim #2: The development of new visualizations to support analysis and integration of data sources and queries. New visual paradigms will be incorporated into the software; these paradigms are not genome-centric but are targeted at facilitating the biological interpretation of available data sources or of specific queries as defined in Aim 1. Through collaboration with users associated with one of the NIAID-funded Biodefense Proteomics Research Centers (www.proteomicsresource.org/PRC/About.aspx), we will demonstrate the data integration capabilities with the end goal of virulence determinant discovery in Salmonella.