We propose to design efficient computer algorithms providing novel and/or improved methods and software for a number of computational problems in molecular biology. the emphasis will b eon blending theoretical results with practical concerns. All software emerging from the project will be made available free of charge. The proposal and the principal investigator's current research efforts are divided into three projects. The first project centers on computational problems in the physical mapping and sequencing of DNA. We propose to continue refining a software system for the fragment assembly problem, i.e., determine the most likely complete DNA sequence consistent with electrophoresis data gathered from cloned fragments. The refinements consist of improved algorithms for a number of the phases of the computation, expanding the functionality to support user- interaction, and developing a complete environment to support megabase sequencing projects. The methods we developed for fragment assembly also apply in large part to the problem of determining physical maps via various fingerprinting techniques. We have formulated a generalized assembly problem and plan to build a system that is capable of solving such problems for any combination of restriction map, digest, and hybridization information about the clones. The second project is to design algorithms for a number of computational problems arising in molecular biology. Progress in this arena tends to be inspired rather than calculated. We demonstrate our track record of producing interesting results and then describe the following problems for which we have a number of ideas and preliminary results: sublinear similarity searches, restriction map comparison, super pattern matching (gene recognition), determining restriction maps from digest data, designing oligonucleotide probes, and RNA secondary structure prediction. The objective of the last project is to develop a pattern matching system permitting the expression of complex patterns and their reduction to efficient search strategies. The pattern specification language is simple yet powerful enough to succinctly express the most complex patterns of biological interest. An """"""""expert system"""""""" compiler for the language will examine a pattern and will choose a search strategy or combination of strategies from a built-in library of basic search techniques. The build- in library will contain implementations of the best available search algorithms for exact and approximate matches to keywords, repeats, and regular expressions. Using a dynamic-programming style calculation the expert compiler chooses the optimal backtracking strategy over the basic library searches. We have proven the efficacy of this approach on a small prototype for a subset of the language that is useful for specifying protein motifs. We now propose to embark upon the construction of a complete system.

Agency
National Institute of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM004960-06
Application #
2237671
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Project Start
1988-09-01
Project End
1995-12-31
Budget Start
1994-01-01
Budget End
1994-12-31
Support Year
6
Fiscal Year
1994
Total Cost
Indirect Cost
Name
University of Arizona
Department
Type
Other Domestic Higher Education
DUNS #
City
Tucson
State
AZ
Country
United States
Zip Code
85721
Jain, M; Myers, E W (1997) Algorithms for computing and integrating physical maps using unique probes. J Comput Biol 4:449-66
Myers, E W (1996) Approximate matching of network expressions with spacers. J Comput Biol 3:33-51
Myers, G; Selznick, S; Zhang, Z et al. (1996) Progressive multiple alignment with constraints. J Comput Biol 3:563-72
Jain, M; Myers, E W (1995) A note on scoring clones given a probe ordering. J Comput Biol 2:33-7
Myers, E W (1995) Toward simplifying and accurately formulating fragment assembly. J Comput Biol 2:275-90
Myers, E W; Huang, X (1992) An O (N2 log N) restriction map comparison and search algorithm. Bull Math Biol 54:599-618
Altschul, S F; Gish, W; Miller, W et al. (1990) Basic local alignment search tool. J Mol Biol 215:403-10