The Committee on Models for Biomedical Research has proposed a "matrix of biological knowledge" (Matrix) to address the need for a means to organize and intelligently access the wealth of information that confronts biomedical researchers (see Models for Biomedical Research, National Academy Press, 1985). Implementing the Matrix will require a fusion of artificial intelligence (AI) and database technologies, as well as an understanding of the structure of biological knowledge. However, a full implementation of the Matrix is best not attempted until pilot studies on a restricted domain of knowledge have demonstrated the feasibility and utility of the concept. One domain of considerable general interest is that of nucleic acid and protein sequence libraries, which have become invaluable research tools for a large segment of the biomedical research community. In addition to the libraries themselves, a large number of computer programs have been written to analyze the sequence data, and these analyses have resulted in major new discoveries. The utility of these libraries to the typical laboratory investigator could be greatly enhanced by augmenting them with the types of knowledge management and reasoning systems envisioned for the Matrix. The goal of this proposed work is to build a testbed for the Matrix concept on top of the existing sequence libraries, benefiting both the larger goals of the Matrix and the immediate needs of sequence library users.
Steps in this process will include: (1) implementation of the GenBank database in a relational form, interfaced to an AI environment; (2) creation of a series of knowledge bases relating to the database, starting with simple taxonomies and progressing to more complex representations (on narrower test domains); (3) creation of a set of user interfaces oriented to the needs of molecular biologists; and (4) investigation of various techniques of inference against the knowledge base, in particular reasoning by analogy, with the ultimate aim of allowing users to build and test hypotheses locally.

Agency
National Institutes of Health (NIH)
Institute
National Center for Research Resources (NCRR)
Type
Research Project (R01)
Project #
2R01RR004026-04
Application #
3421488
Study Section
Genome Study Section (GNM)
Project Start
1990-09-30
Project End
1991-09-29
Budget Start
1990-09-30
Budget End
1991-09-29
Support Year
4
Fiscal Year
1990
Name
Unisys
City
Paoli
State
PA
Country
United States
Zip Code
19301
Stoeckert, C; Pizarro, A; Manduchi, E et al. (2001) A relational schema for both array-based and SAGE gene expression experiments. Bioinformatics 17:300-8
Manduchi, E; Grant, G R; McKenzie, S E et al. (2000) Generation of patterns from gene expression data by assigning confidence to differentially expressed genes. Bioinformatics 16:685-98
Kolchanov, N A; Podkolodnaya, O A; Ananko, E A et al. (2000) Transcription regulatory regions database (TRRD): its status in 2000. Nucleic Acids Res 28:298-301
Phillips, R L; Ernst, R E; Brunk, B et al. (2000) The genetic program of hematopoietic stem cells. Science 288:1635-40
Kolchanov, N A; Ponomarenko, M P; Frolov, A S et al. (1999) Integrated databases and computer systems for studying eukaryotic gene expression. Bioinformatics 15:669-86
Babenko, V N; Kosarev, P S; Vishnevsky, O V et al. (1999) Investigating extended regulatory regions of genomic DNA sequences. Bioinformatics 15:644-53
Stoeckert Jr, C J; Salas, F; Brunk, B et al. (1999) EpoDB: a prototype database for the analysis of genes expressed during vertebrate erythropoiesis. Nucleic Acids Res 27:200-3
Ponomarenko, M P; Ponomarenko, J V; Frolov, A S et al. (1999) Oligonucleotide frequency matrices addressed to recognizing functional DNA sites. Bioinformatics 15:631-43
Ponomarenko, M P; Ponomarenko, J V; Frolov, A S et al. (1999) Identification of sequence-dependent DNA features correlating to activity of DNA sites interacting with proteins. Bioinformatics 15:687-703
Ponomarenko, J V; Ponomarenko, M P; Frolov, A S et al. (1999) Conformational and physicochemical DNA features specific for transcription factor binding sites. Bioinformatics 15:654-68

Showing the most recent 10 out of 16 publications