Multiple-species alignment is one of the techniques for deciphering the functions of regions within the human genome. Alignment of genomic sequence data from various species indicates which regions are most highly conserved, allowing identification of regions that perform some essential function, such as regulation of gene expression. Novel computational challenges must be overcome to produce the most informative alignments of multiple-species genomic data. For example, genomic alignments can be much longer than any alignment of protein sequences, and one must deal with several common families of repetitive elements and, perhaps, with differences in mode of evolution between coding and noncoding DNA. Full utility of alignment-generating programs requires that they be integrated into a software system that includes alignment analysis tools and various data management capabilities. We propose the following extensions to our ongoing development of strategies and software tools for multiple-species alignment of genomic sequence data. Our work with the β-like globin gene cluster will be extended to additional genomic loci. Laboratory experiments will be conducted to provide guidelines for choosing additional species and for evaluating the biological correctness of genomic alignments. Our existing alignment tools will be enhanced to more fully utilize biological knowledge, and a variety of supporting software tools will be produced. Finally, we will explore other sequence alignment problems that may be synergistically related to our genomic alignment research.