Human and mouse genomic sequences will be aligned and annotated using the World Wide Web server PipMaker. This method of analysis readily identifies exons of orthologous genes. Furthermore, conserved noncoding regions can be visualized as high-scoring segment pairs in a percent identity plot (pip) display. Thus, regulatory elements that have been conserved under evolutionary pressure will also be annotated in this study. Specifically, alignments of syntenic human and mouse sequences will be computed and placed in a database for public access. Regulatory elements within aligned regions will be grouped based on the density of conserved noncoding sites. These groupings will be used as datasets to evaluate whether defined thresholds for identifying important regulatory elements are more informative than phylogenetic distance, which may identify either too many conserved noncoding regions or too few. In addition to annotating genes from genomic sequence data (facilitating the elucidation of genes that contribute to human genetic disease), the purpose of this study is to map the regulatory elements for many of these genes. Thus, critical targets for experimental study will be identified. These results will be shared with researchers interested in studying regulated expression of genes within the aligned sequences. An additional outcome of this study will be datasets of aligned sequences for computer scientists interested in improving ab initio methods of identifying transcription factor binding sites in genomic sequences.
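The grouping of conserved noncoding sites by density can be illustrated with a minimal Python sketch. It is not the project's pipeline and does not parse PipMaker output: the `Segment` record, the function names, the 100 bp / 70% identity thresholds, the window size, and the density cutoffs are all hypothetical values chosen for illustration. The sketch filters aligned segments by a defined threshold, discards those overlapping annotated exons, counts the survivors per fixed-size window, and labels each window by density.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One gap-free aligned segment in human coordinates (hypothetical record)."""
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    identity: float   # percent identity between human and mouse


def conserved_noncoding(segments, exons, min_len=100, min_identity=70.0):
    """Keep segments that meet the thresholds and overlap no exon.

    The default thresholds are illustrative assumptions, not values from the study.
    `exons` is a list of (start, end) pairs, 1-based and inclusive.
    """
    kept = []
    for seg in segments:
        long_enough = seg.end - seg.start + 1 >= min_len
        similar_enough = seg.identity >= min_identity
        in_exon = any(seg.start <= e_end and e_start <= seg.end
                      for e_start, e_end in exons)
        if long_enough and similar_enough and not in_exon:
            kept.append(seg)
    return kept


def density_per_window(segments, region_length, window=10_000):
    """Count segments per window, assigning each segment by its midpoint."""
    n_windows = -(-region_length // window)  # ceiling division
    counts = [0] * n_windows
    for seg in segments:
        midpoint = (seg.start + seg.end) // 2
        index = min((midpoint - 1) // window, n_windows - 1)
        counts[index] += 1
    return counts


def group_by_density(counts, low=1, high=2):
    """Label each window 'none', 'low', or 'high' using assumed cutoffs."""
    labels = []
    for count in counts:
        if count >= high:
            labels.append("high")
        elif count >= low:
            labels.append("low")
        else:
            labels.append("none")
    return labels
```

Changing `min_len` and `min_identity` and comparing the resulting groups is one simple way to explore how a choice of threshold affects the number of candidate regions reported.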