We propose to design efficient computer algorithms providing novel and/or improved methods and software for a number of computational problems in molecular biology. The proposal and the principal investigator's current research efforts are divided into three projects, as follows. The first project centers on computational problems in the sequencing of DNA and physical mapping of genomes. We propose to continue refining our algorithms and software library for the fragment assembly problem, i.e., determining the most likely complete DNA sequence consistent with electrophoresis data gathered from cloned fragments. The refinements consist of improved algorithms for all phases of the computation: ultra- rapid overlap detection, assembly in the presence of constraints modeling additional experimental information, a formulation of the problem that correctly handles repetitive sequence, and a multi-alignment component that accommodates base-calling quality figures. We also propose work on ordered shotgun sequencing (OSS) and a cDNA database for the data being generated at Washington University under contract to Merck. Lastly, there is much similarity between fragment assembly and physical mapping, save that the relative level of experimental error is higher. We propose an algorithm for STS content mapping based on rules-of-inference that are true with very high probability. The second project is to design better algorithms for a number of computational problems arising in molecular biology. Progress in this arena tends to be inspired rather than calculated. We demonstrate our track record of producing interesting results and then describe the following problems for which we have a number of ideas and preliminary results: sublinear Smith-Waterman database searches, determining restriction maps from digest data, grammar-based pattern matching, selecting PCR primers, predicting RNA secondary structure, and docking rigid molecules (3D matching). The final project involves the introduction into our funded research activities of a new area, molecular graphics. Earlier, we developed MacMolecule 1.7 which rendered space-filling, ball-and-stock, and wire- frame views of molecules. We estimate 10,000 copies are currently in use around the world.
Our aim i s high-speed, high-quality graphics, on low- end machines achieved by virtue of very efficient rendering algorithms. We have begun to develop a new version of MacMolecule, called Linus 2.0, which will deliver superior performance and visualizations along with greater capabilities, including ribbon renderings of protein secondary structure, zoom, hi-lighting, picking, and additional visualization modes. We further propose to build a helper application supporting """"""""Linus"""""""" content files so that Linus may be used with the World-Wide Web. In further years, we will support molecular animation, construction, and possibly dynamics.
Jain, M; Myers, E W (1997) Algorithms for computing and integrating physical maps using unique probes. J Comput Biol 4:449-66 |
Myers, E W (1996) Approximate matching of network expressions with spacers. J Comput Biol 3:33-51 |
Myers, G; Selznick, S; Zhang, Z et al. (1996) Progressive multiple alignment with constraints. J Comput Biol 3:563-72 |
Jain, M; Myers, E W (1995) A note on scoring clones given a probe ordering. J Comput Biol 2:33-7 |
Myers, E W (1995) Toward simplifying and accurately formulating fragment assembly. J Comput Biol 2:275-90 |
Myers, E W; Huang, X (1992) An O (N2 log N) restriction map comparison and search algorithm. Bull Math Biol 54:599-618 |
Altschul, S F; Gish, W; Miller, W et al. (1990) Basic local alignment search tool. J Mol Biol 215:403-10 |