The investigator studies three new coding-theoretic paradigms that arise during genetic data acquisition, analysis, and modeling. The biological principles governing the systems for which coding solutions are sought are DNA and RNA sequence hybridization and self-hybridization. The biochemical property supporting hybridization is the affinity of bases in single DNA and RNA strands to form hydrogen bonds with their complementary bases, as defined by the Watson-Crick rule. By forming such bonds, paired bases generate planar or spatial structures composed of either two complementary strands or a single strand. Bonded structures have increased stability, and they also play an important role in regulating various cellular functions, including pre-mRNA editing and post-transcriptional gene silencing. In certain cases, specific self-hybridization patterns in DNA sequences are precursors to sequence breakage and are closely associated with genetic diseases such as cancer.
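
As a rough illustration of the Watson-Crick rule for DNA (A pairs with T, G pairs with C), the following minimal Python sketch models hybridization as exact reverse-complement matching. The function names (`reverse_complement`, `can_hybridize`, `has_hairpin_stem`) are hypothetical and chosen for illustration only. The model is an idealized assumption: it ignores thermodynamics, partial matches, and minimum loop lengths, and it is not the investigator's framework.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
WC_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs perfectly with `strand` in antiparallel orientation."""
    return "".join(WC_PAIR[base] for base in reversed(strand))


def can_hybridize(a: str, b: str) -> bool:
    """Idealized hybridization test: b must be the exact reverse complement of a."""
    return b == reverse_complement(a)


def has_hairpin_stem(strand: str, stem: int) -> bool:
    """Idealized self-hybridization test.

    Returns True if some window of length `stem` is followed later in the
    same strand (without overlap) by its reverse complement, so the strand
    could fold back on itself. Loop-length limits are ignored (assumption).
    """
    for i in range(len(strand) - 2 * stem + 1):
        target = reverse_complement(strand[i:i + stem])
        if strand.find(target, i + stem) != -1:
            return True
    return False
```

For example, `can_hybridize("AAGC", "GCTT")` holds under this model, and a strand such as `GGGCAAAAGCCC` contains a length-4 segment together with its reverse complement, which is the kind of pattern that permits self-hybridization.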

Besides its chemical and physical properties, sequence hybridization has distinctly combinatorial features. The investigator uses these features to establish a rigorous mathematical framework for analyzing technological and biological systems that operate on the principle of sequence hybridization. Several classical coding schemes are analyzed in new biological settings, including superimposed designs, balanced codes, and run-length constrained codes. Such schemes are generalized, combined, and optimized for a given application. Furthermore, new coding-theoretic and algorithmic problems are introduced that open challenging research directions in algebraic coding theory. The outlined interdisciplinary research is expected to have a significant impact on the development of cDNA and aptamer microarrays and to lead to the creation of a new educational program at the University of Colorado, Boulder.
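
To make two of the named constraint types concrete, here is a minimal Python sketch of how a DNA word might be checked against a balance constraint and a run-length constraint. Both definitions are assumptions made for illustration: "balanced" is taken to mean that exactly half of the bases are G or C (a common DNA analogue of a balanced binary code), and the run-length constraint is taken to be an upper bound on the longest run of identical bases. The function names are hypothetical, and this is not the construction proposed by the investigator.

```python
from itertools import groupby


def is_gc_balanced(strand: str) -> bool:
    """Assumed balance rule: exactly half of the bases are G or C."""
    gc_count = sum(base in "GC" for base in strand)
    return 2 * gc_count == len(strand)


def max_run_length(strand: str) -> int:
    """Length of the longest run of identical consecutive bases (0 if empty)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def satisfies_constraints(strand: str, max_run: int) -> bool:
    """Check both assumed constraints: GC balance and a bounded run length."""
    return is_gc_balanced(strand) and max_run_length(strand) <= max_run
```

Under these assumptions, `ACGT` satisfies both constraints with `max_run=1`, while `AACG` is balanced but violates the run-length bound because of the repeated `A`.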

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0644427
Program Officer: John Cozzens
Project Start:
Project End:
Budget Start: 2007-02-01
Budget End: 2008-02-29
Support Year:
Fiscal Year: 2006
Total Cost: $151,070
Indirect Cost:
Name: University of Colorado at Boulder
Department:
Type:
DUNS #:
City: Boulder
State: CO
Country: United States
Zip Code: 80309