The major objective of this project is to develop algorithms and software for automated annotation of the coding regions of large genomic DNA sequences. The investigators will improve an analysis and annotation tool (AAT) that uses fast database searching and rigorous alignment to locate exons in the genomic sequence and to define intron-exon boundaries. The new annotation software will be developed by integrating the improved AAT with gene prediction programs. It will assemble exons produced by the improved AAT and exons predicted by the gene prediction programs into gene structures, using some of the exons produced by the improved AAT as constraints in the assembly. Another goal of this project is to develop a rigorous program for producing an optimal alignment between two DNA sequences. A novel feature of the program is that coding frame information will be incorporated into the alignment model. An optimal alignment produced by the program will show the correspondence between the codons of the two sequences, so the alignment remains meaningful when the codons are translated into amino acids.
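
The idea of building coding frame information into an alignment model can be illustrated with a minimal Python sketch. It is not the investigators' program, whose model is not specified in this abstract. It assumes both sequences are already in frame and have lengths divisible by three, aligns whole codons with a Needleman-Wunsch style recurrence, and permits only whole-codon gaps. The function names (`split_codons`, `codon_score`, `align_codons`) and the scores (2 for identical codons, 1 for synonymous codons, -1 for other mismatches, -2 per gap codon) are hypothetical choices made for illustration.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

GAP = -2  # illustrative penalty for a whole-codon gap (assumption)


def split_codons(seq):
    """Split an in-frame DNA sequence into a list of codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def codon_score(a, b):
    """Illustrative scores: identical 2, synonymous 1, otherwise -1."""
    if a == b:
        return 2
    if CODON_TABLE[a] == CODON_TABLE[b]:
        return 1
    return -1


def align_codons(seq_a, seq_b):
    """Globally align two in-frame sequences codon by codon."""
    a = split_codons(seq_a.upper())
    b = split_codons(seq_b.upper())
    n, m = len(a), len(b)

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + codon_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + GAP,
                score[i][j - 1] + GAP,
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + codon_score(a[i - 1], b[j - 1])):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            row_a.append(a[i - 1])
            row_b.append("---")
            i -= 1
        else:
            row_a.append("---")
            row_b.append(b[j - 1])
            j -= 1
    row_a.reverse()
    row_b.reverse()
    return score[n][m], row_a, row_b
```

Because every aligned column is a codon or a whole-codon gap, each column of the result can be translated directly with `CODON_TABLE`. A fuller model would also have to handle frameshifts, affine gap penalties, and sequences whose reading frame is unknown.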

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
7R01HG001676-03
Application #
6209659
Study Section
Genome Study Section (GNM)
Program Officer
Feingold, Elise A
Project Start
1998-01-01
Project End
2000-12-31
Budget Start
1999-08-28
Budget End
1999-12-31
Support Year
3
Fiscal Year
1999
Total Cost
Indirect Cost
Name
Keck Graduate Institute of Applied Life Sciences
Department
Type
DUNS #
011116907
City
Claremont
State
CA
Country
United States
Zip Code
91711
Wang, Jianmin; Huang, Xiaoqiu (2005) A method for finding single-nucleotide polymorphisms with allele frequencies in sequences of deep coverage. BMC Bioinformatics 6:220
Ye, Liang; Huang, Xiaoqiu (2005) MAP2: multiple alignment of syntenic genomic sequences. Nucleic Acids Res 33:162-70
Huang, Xiaoqiu; Ye, Liang; Chou, Hui-Hsien et al. (2004) Efficient combination of multiple word models for improved sequence comparison. Bioinformatics 20:2529-33
Huang, Xiaoqiu; Wang, Jianmin; Aluru, Srinivas et al. (2003) PCAP: a whole-genome assembly program. Genome Res 13:2164-70
Huang, Xiaoqiu; Chao, Kun-Mao (2003) A generalized global alignment algorithm. Bioinformatics 19:228-33