A tandem repeat is an occurrence of two or more adjacent, often approximate, copies of a nucleotide sequence. Tandem repeats have known functional roles, including coding with loss of function, switching, and acting as modifiers of gene expression. They are primary components of chromosomal structures and are useful for genetic linkage analysis, bacterial strain typing, DNA fingerprinting, and studies of changes in DNA over short time scales.
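
To make the definition concrete, the following is a minimal sketch that scans a DNA string for exact tandem repeats only. It is not the software described in this project, which also handles approximate copies; the function name `find_exact_tandem_repeats` and its parameters are hypothetical and chosen purely for illustration.

```python
def find_exact_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=2):
    """Return (start, unit, copies) tuples for runs of exact adjacent copies.

    Illustrative only: approximate repeats (substitutions, indels) are not
    detected, and the unit-length limits are arbitrary assumptions.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many further copies of the unit follow immediately.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                # Prefer the longest span; on ties keep the shorter unit.
                if best is None or copies * k > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best is not None:
            hits.append(best)
            i += len(best[1]) * best[2]  # skip past the reported repeat
        else:
            i += 1
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_exact_tandem_repeats("GGCACACACATT", min_unit=2):
        print(start, unit, copies)
```

Under these assumptions, each hit reports the 0-based start position, the repeated unit, and the copy number. Detecting the approximate copies mentioned above would require an alignment-based method instead of exact string comparison.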

Identification of tandem repeats has been made easier by new software that processes an entire genome. This rapid analysis permits the identification and annotation of repeats and their clustering into families for further study. A multi-genome Tandem Repeats Database will bring together information about repeats and serve as the platform for developing new tools. These include algorithms to compare and cluster repeats and to identify predictive criteria for copy number polymorphisms. The tools will further enable annotation of repeats and repeat families, including genomic environment, copy number polymorphisms, whole-genome properties, and family properties. All of this will be available through a web site with integrated data visualization and a data model specification for transfer to other formats.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0413462
Program Officer: Manfred D. Zorn
Project Start:
Project End:
Budget Start: 2003-09-01
Budget End: 2005-08-31
Support Year:
Fiscal Year: 2004
Total Cost: $704,967
Indirect Cost:
Name: Boston University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215