Comparison of biomolecular sequences has become essential in modern molecular biology and biotechnology. This project will develop and test algorithmic techniques for exploring DNA and protein sequences, with emphasis on techniques for newer or less analyzed biological phenomena. There are five areas of concentration: a fundamental re-examination of the biological basis for alignment models and parameters in light of recently available sequence data and the accumulating understanding of mutation mechanisms; expansion of parametric alignment methods to a broader range of sequence analysis problems; development of methods to organize a large set of alignments, so that alternative alignments can be presented without overwhelming the researcher; formalization of notions of repeated substrings containing variations, exploiting deep theorems from the study of exact repeats; and connection of the problem of protein structure alignment to a body of efficient techniques from location theory.

Molecular sequence comparison is the essential complement of projects, such as the Human Genome Project, that are obtaining the full DNA sequence transcript of various genomes. Comparison of sequences within and across species has tremendously accelerated many tasks in molecular biology and biotechnology. The potential impact of this project lies in the development of more biologically based alignment models, in the production of novel software, and in the adaptation of mature mathematical and computational techniques to problems in molecular biology. Because sequence comparison has become critical in molecular biology and in many aspects of the work of commercial biotechnology and pharmaceutical companies, improved models and software for sequence comparison may have a significant impact on the way disease genes are identified, on the way newly identified genes are understood to function, and on the way new candidate molecular targets for medicines are identified.
This work is funded by the Computational Biology Activity (BIO) and the program in the Theory of Computing (CISE).

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 9723346
Program Officer: Gerald F. Guala
Project Start:
Project End:
Budget Start: 1997-09-01
Budget End: 2002-10-31
Support Year:
Fiscal Year: 1997
Total Cost: $357,899
Indirect Cost:
Name: University of California Davis
Department:
Type:
DUNS #:
City: Davis
State: CA
Country: United States
Zip Code: 95618