The major objective of this project is to develop algorithms and software for automated annotation of the coding regions of a large genomic DNA sequence. The investigators will improve an analysis and annotation tool (AAT) that uses fast database searching and rigorous alignment to locate exons in the genomic sequence and to define intron-exon boundaries. The new annotation software will be developed by integrating the improved AAT with gene prediction programs. It will assemble exons produced by the improved AAT and exons predicted by the gene prediction programs into gene structures, using some of the AAT exons as constraints in the assembly. A second goal of the project is to develop a rigorous program for producing an optimal alignment between two DNA sequences. A novel feature of the program is that coding frame information will be incorporated into the alignment model. An optimal alignment produced by the program will show the correspondence between the codons of the two sequences, so the alignment remains meaningful when the codons are translated into amino acids.
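
The idea of a codon-level alignment can be illustrated with a minimal sketch. This is not the investigators' program, whose alignment model is not specified in this abstract. It is a plain global alignment (Needleman-Wunsch style) performed over whole codons, assuming both inputs are in-frame coding sequences whose lengths are multiples of three. The scoring values, the linear gap penalty, and the names `align_codons`, `codon_score`, and `translate` are assumptions chosen for illustration.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

GAP = -2  # assumed penalty for inserting or deleting one whole codon


def split_codons(dna):
    """Split an in-frame coding sequence into a list of codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def codon_score(a, b):
    """Assumed scores: identical codon 2, synonymous codon 1, otherwise -1."""
    if a == b:
        return 2
    if CODON_TABLE[a] == CODON_TABLE[b]:
        return 1
    return -1


def align_codons(dna_a, dna_b):
    """Globally align two coding sequences codon by codon."""
    a, b = split_codons(dna_a), split_codons(dna_b)
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + codon_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + GAP,
                score[i][j - 1] + GAP,
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + codon_score(a[i - 1], b[j - 1])):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            row_a.append(a[i - 1]); row_b.append("---"); i -= 1
        else:
            row_a.append("---"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], row_a[::-1], row_b[::-1]


def translate(codons):
    """Translate aligned codons; a codon-sized gap becomes '-'."""
    return "".join(CODON_TABLE.get(c, "-") for c in codons)


if __name__ == "__main__":
    total, row_a, row_b = align_codons("ATGGCTAAATGA", "ATGGCCTGA")
    print(total)
    print(" ".join(row_a), translate(row_a))
    print(" ".join(row_b), translate(row_b))
```

Because gaps are introduced only in units of whole codons, each column of the result pairs one codon with one codon or with a codon-sized gap, so the same alignment can be read at the amino acid level. A realistic model would also need to handle frameshifts, non-coding regions, and more refined gap penalties, none of which this sketch attempts.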

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG001676-01
Application #: 2439715
Study Section: Genome Study Section (GNM)
Project Start: 1998-01-01
Project End: 2000-12-31
Budget Start: 1998-01-01
Budget End: 1998-12-31
Support Year: 1
Fiscal Year: 1998
Total Cost:
Indirect Cost:

Name: Michigan Technological University
Department:
Type: Organized Research Units
DUNS #: 065453268
City: Houghton
State: MI
Country: United States
Zip Code: 49931
Wang, Jianmin; Huang, Xiaoqiu (2005) A method for finding single-nucleotide polymorphisms with allele frequencies in sequences of deep coverage. BMC Bioinformatics 6:220
Ye, Liang; Huang, Xiaoqiu (2005) MAP2: multiple alignment of syntenic genomic sequences. Nucleic Acids Res 33:162-70
Huang, Xiaoqiu; Ye, Liang; Chou, Hui-Hsien et al. (2004) Efficient combination of multiple word models for improved sequence comparison. Bioinformatics 20:2529-33
Huang, Xiaoqiu; Wang, Jianmin; Aluru, Srinivas et al. (2003) PCAP: a whole-genome assembly program. Genome Res 13:2164-70
Huang, Xiaoqiu; Chao, Kun-Mao (2003) A generalized global alignment algorithm. Bioinformatics 19:228-33