Microarray analysis of genome expression data is an emerging technology that allows biologists to measure patterns of gene expression across a large portion of a single genome in one measurement. The technology is currently being commercialized, and organizations that adopt it are expected to accumulate large databases of microarray results. Because microarray data are very high dimensional, existing database indexing schemes are not well suited to efficient content-based searches of large microarray databases. The work proposed here will develop a new database indexing scheme designed specifically for efficient searches of microarray databases. The innovative feature of the design is the combination of a feature map, which maps the very high dimensional data space onto a feature space of lower dimensionality, and an index on that lower dimensional feature space that uses a binary space partition tree to organize the geometry of the feature space.
AIMS has successfully applied a similar methodology to indexing problems involving databases containing high dimensional signal data from radar pulses. The software developed for those applications would form the basis for the software that implements the microarray index. To investigate the performance of the proposed indexing scheme, a model based on metabolic networks will be developed to produce synthetic microarray data with statistical characteristics similar to those of actual experimental data.
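
The two-stage design described above (a feature map followed by a space-partitioning index) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the proposed system: the abstract does not say how the feature map is built, so a random linear projection stands in for it, and the binary space partition tree is restricted to axis-aligned splitting planes (a k-d tree, which is one special case of a BSP tree). The names `make_feature_map`, `build_tree`, `sq_dist`, and `nearest` are hypothetical.

```python
import random

def make_feature_map(n_genes, n_features, seed=0):
    """Return a random linear projection from n_genes to n_features dimensions.

    Assumption: a random projection is only a stand-in for the feature map.
    """
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for _ in range(n_features)]

    def feature_map(profile):
        return tuple(sum(w * x for w, x in zip(row, profile)) for row in rows)

    return feature_map

def build_tree(points, depth=0, leaf_size=4):
    """Recursively split (feature_vector, record_id) pairs with axis-aligned planes."""
    if len(points) <= leaf_size:
        return {"leaf": points}
    axis = depth % len(points[0][0])           # cycle through feature axes
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "axis": axis,
        "split": points[mid][0][axis],         # position of the splitting plane
        "left": build_tree(points[:mid], depth + 1, leaf_size),
        "right": build_tree(points[mid:], depth + 1, leaf_size),
    }

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Return (squared_distance, record_id) of the closest indexed feature vector."""
    if "leaf" in node:
        for vec, rec_id in node["leaf"]:
            d = sq_dist(vec, query)
            if best is None or d < best[0]:
                best = (d, rec_id)
        return best
    diff = query[node["axis"]] - node["split"]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if best is None or diff * diff < best[0]:
        best = nearest(far, query, best)
    return best

# Usage with synthetic data: 60 expression profiles of 200 genes each.
rng = random.Random(1)
profiles = [[rng.random() for _ in range(200)] for _ in range(60)]
fmap = make_feature_map(n_genes=200, n_features=5, seed=2)
tree = build_tree([(fmap(p), i) for i, p in enumerate(profiles)])
dist, record_id = nearest(tree, fmap(profiles[7]))
```

In this sketch the search runs entirely in the 5-dimensional feature space, so the match is nearest with respect to the projected vectors, not the original 200-dimensional profiles. How well feature-space neighbors track neighbors in the original space depends on the feature map chosen.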

Proposed Commercial Applications

This research will provide a fundamental database component that will increase the usefulness of large databases of microarray assays of genome expression in tissue. No good alternative technologies currently exist. Any company wishing to develop a system that efficiently queries the information content of large microarray databases will require an indexing scheme such as the one proposed here.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43HG002055-01A1
Application #
6144561
Study Section
Special Emphasis Panel (ZRG1-SSS-Y (01))
Program Officer
Feingold, Elise A
Project Start
2000-05-10
Project End
2001-01-31
Budget Start
2000-05-10
Budget End
2001-01-31
Support Year
1
Fiscal Year
2000
Total Cost
$100,000
Indirect Cost
Name
Aims, Inc.
Department
Type
DUNS #
City
Rockville
State
MD
Country
United States
Zip Code
20852