Generalization of the GeneMark family of gene prediction programs to the analysis of closely related genomes using a Bayesian segmentation approach. This research will be done primarily in Russia as an extension of NIH grant #5R01HG00783.
The aim of this project is to build a global genomic alignment of two evolutionarily close genomes and to use a modification of the segmentation algorithm to parse syntenic regions of the genome sequences. To this end, a chain alignment of extended genomic regions will be constructed. Pairs of aligned sequences will be parsed into segments with different sequence variation statistics. A number of statistical models of genomic alignments will be built, and a system that automatically chooses the model relevant to the alignment of a particular region will be designed. The final objective of the project is to develop new algorithms for similarity-based gene finding in pairwise alignments of DNA sequences of closely related species. All software programs will be available to users via a WWW interface.
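The idea of parsing an aligned sequence pair into segments with different variation statistics can be illustrated with a minimal sketch. It is not the project's Bayesian segmentation algorithm: it substitutes Viterbi decoding of a simple two-state hidden Markov model, and the function names (`segment_alignment`, `to_segments`), the per-state mismatch rates, and the switching probability are all hypothetical values chosen for illustration.

```python
import math


def segment_alignment(seq_a, seq_b, rates=(0.05, 0.30), switch_prob=0.01):
    """Label each alignment column with the most likely state (Viterbi path).

    Assumptions (illustrative only): state i emits a mismatch with
    probability rates[i]; a gap paired with a base counts as a mismatch;
    at least two states are given.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # One observation per column: 1 if the aligned symbols differ, else 0.
    obs = [int(a != b) for a, b in zip(seq_a, seq_b)]
    if not obs:
        return []

    n_states = len(rates)
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_states - 1))

    def log_emit(state, o):
        p = rates[state]
        return math.log(p if o else 1.0 - p)

    # Uniform prior over states for the first column.
    score = [math.log(1.0 / n_states) + log_emit(s, obs[0])
             for s in range(n_states)]
    back = []  # back[i][s] = best predecessor of state s at column i + 1
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            def via(r, s=s):
                return score[r] + (log_stay if r == s else log_switch)
            best = max(range(n_states), key=via)
            pointers.append(best)
            new_score.append(via(best) + log_emit(s, o))
        score = new_score
        back.append(pointers)

    # Trace the best path back from the last column.
    state = max(range(n_states), key=lambda s: score[s])
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    return labels


def to_segments(labels):
    """Collapse per-column labels into (start, end, state) runs, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments
```

Calling `to_segments(segment_alignment(a, b))` on two equal-length aligned strings returns runs of columns assigned to the low-variation or high-variation state. A model-selection step such as the one the abstract describes would compare several such parameter sets per region rather than fixing one in advance.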
Boeva, Valentina; Regnier, Mireille; Papatsenko, Dmitri et al. (2006) Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression. Bioinformatics 22:676-84
Kattenhorn, Lisa M; Mills, Ryan; Wagner, Markus et al. (2004) Identification of proteins associated with murine cytomegalovirus virions. J Virol 78:11187-97