This proposal seeks to maintain and further develop an object-oriented software framework to represent and access macromolecular structure data of proteins, DNA, RNA, and their complexes. In this context, a software framework implies a set of reusable software components that encompass the storage, intuitive and efficient retrieval, and processing of macromolecular structure-based information, leading to a better understanding of structure-function relationships. A continuing major design goal is the ability to quickly combine and apply these components in previously unanticipated ways, an approach of particular value to the fast-evolving domain of structural biology, where the amount of data, our knowledge about that data, and presumably the new questions that data raises continue to grow at a near-exponential rate.

The existing framework, based on PDBlib, a C++ class library for representing a macromolecular structure, already interfaces to a variety of backend storage formats, including Protein Data Bank (PDB) ASCII files, Object Oriented PDB (OOPDB, a persistent version of PDBlib), and a Derived Features DataBase (DFDB). Query of all backend storage formats is available through the MacroMolecular Query Language (MMQL), a domain-specific, non-formal query methodology. More limited query of OOPDB and DFDB, and reports generated from these databases, are available globally through the Macromolecular Object Oriented Search Engine (MOOSE), a World Wide Web (WWW) server. This proposal seeks to maintain and expand all facets of the framework.

The specific goals of this proposal are as follows:

Maintain and make available, through the resources of the San Diego Supercomputer Center (SDSC), the existing and evolving framework. This implies providing access to current native and derived data, database maintenance, software maintenance, software distribution, and limited assistance to the community of users.

Extend the framework to:
1. Support a broader range of information on biological macromolecules through extensions to PDBlib, namely functional classification of macromolecules, primary sequence where no 3-D structure exists, NMR ensemble data, and experimental data, including data from sources other than X-ray and NMR.
2. Provide new query methods (predominantly for comparative analysis) to make effective use of existing and new data through extensions to MMQL.
3. Provide the ability to store the results of previous queries as specialized databases and subsequently operate on them.

Optimize the storage and query methodology used by the framework.

Continue in-house research using the framework to explore correlations between various structural parameters found in protein structures.

While it is far from proven that the community will use and further develop software frameworks such as the one proposed here, the rationale behind it seems sound at a time when the falling cost of hardware has not been matched by a significant decrease in the cost of software development.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 9507625
Program Officer: Paul Gilna
Project Start:
Project End:
Budget Start: 1995-09-15
Budget End: 1997-06-30
Support Year:
Fiscal Year: 1995
Total Cost: $141,007
Indirect Cost:

Name: General Atomics
Department:
Type:
DUNS #:
City: San Diego
State: CA
Country: United States
Zip Code: 92121