This research focuses on the development of computational tools for analyzing and comparing biological sequences. An important aspect of this work is collaboration with scientists in the biological sciences. Particular emphasis will be placed on tools for detecting multiple repeated regions and tandem repeats in both DNA and protein sequences. These regions are important because they have been associated with regulatory, stabilizing, and evolutionary sites. Many such repeated regions are difficult to detect because the repetitions are inexact, sometimes exhibiting a low percentage of identical symbols. An analysis will be conducted to gain more insight into the difficult problem of determining the threshold for statistical significance of such repeats. The efficient identification of multiple repeats requires a subtle similarity measure, which in turn requires identifying and representing optimal as well as near-optimal alignments between subsequences. An empirical comparative study will be conducted to assess the applicability and significance of the new algorithms. Interactive activities include teaching a seminar to encourage dialogue among students from different disciplines; visiting local high schools; organizing group meetings with undergraduate interns; interacting with graduate students; participating in journal clubs, colloquia, and seminars; and engaging in joint research with other members of the Biochemistry, Medical Informatics and Genome Center at Stanford University.
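To make the idea of detecting inexact repeats through alignments between subsequences more concrete, the following is a minimal sketch only, not the algorithm developed under this award. It runs a Smith-Waterman-style local alignment of a sequence against itself, restricted to the region above the main diagonal so that the trivial match of the sequence with itself is excluded. The function name `best_inexact_repeat` and the scoring values (match 2, mismatch -1, gap -2) are illustrative assumptions, and the sketch makes no attempt to assess statistical significance or to enumerate near-optimal alignments.

```python
def best_inexact_repeat(seq, match=2, mismatch=-1, gap=-2):
    """Find the best-scoring pair of similar regions within one sequence.

    Illustrative sketch: local alignment of `seq` against itself, using
    only cells with j > i. Scoring parameters are assumed values.
    Returns (score, (start1, end1), (start2, end2)) with 0-based,
    half-open spans, so the regions are seq[start1:end1] and
    seq[start2:end2]. The two regions may overlap, as in tandem repeats.
    """
    n = len(seq)
    # H[i][j]: best score of a local alignment ending at seq[i-1], seq[j-1].
    # Cells on or below the main diagonal are never filled and stay 0.
    H = [[0] * (n + 1) for _ in range(n + 1)]
    best_score, best_i, best_j = 0, 0, 0

    def sub(i, j):
        # Substitution score for aligning seq[i-1] with seq[j-1].
        return match if seq[i - 1] == seq[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):  # j > i skips the trivial self-match
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + sub(i, j),  # align the two symbols
                H[i - 1][j] + gap,            # gap in the second region
                H[i][j - 1] + gap,            # gap in the first region
            )
            if H[i][j] > best_score:
                best_score, best_i, best_j = H[i][j], i, j

    # Trace back from the best cell to recover where both regions start.
    i, j = best_i, best_j
    while H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best_score, (i, best_i), (j, best_j)
```

For example, `best_inexact_repeat("GATTACACCGATTACA")` reports the two exact copies of `GATTACA`, while a copy carrying a substitution still aligns but with a lower score. The table uses quadratic time and space in the sequence length, so this form is suitable only for short sequences.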

Agency: National Science Foundation (NSF)
Institute: Division of Human Resource Development (HRD)
Type: Standard Grant (Standard)
Application #: 9627109
Program Officer: Margrete S. Klein
Project Start:
Project End:
Budget Start: 1996-09-15
Budget End: 1997-08-31
Support Year:
Fiscal Year: 1996
Total Cost: $160,565
Indirect Cost:

Name: Stanford University
Department:
Type:
DUNS #:
City: Palo Alto
State: CA
Country: United States
Zip Code: 94304