The investigators will develop and improve a DNA sequence assembly program to support large-scale DNA sequencing projects at publicly funded genome centers. In the initial project period they developed a sequence assembly program named CAP3. They now propose to add new capabilities to CAP3 and to improve its existing methods. Specifically, the investigators will (1) improve the method for using forward-reverse constraints, (2) improve the method for generating consensus sequences, (3) add an option to support directed assembly in the sequence finishing phase, (4) add a capability to use databases of protein and EST sequences to order contigs produced in low-pass sequencing projects, and (5) add a capability to address alternative splicing patterns in the assembly of EST sequences. They will work closely with Dr. Lee Hood's genome center. This relationship will ensure that their work on the assembly program remains relevant to real-world sequencing projects. The investigators will also assist with integrating their assembly program into the sequencing environments at other genome centers.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG001502-07
Application #
6637425
Study Section
Genome Study Section (GNM)
Program Officer
Felsenfeld, Adam
Project Start
1996-08-01
Project End
2004-08-31
Budget Start
2003-03-01
Budget End
2004-08-31
Support Year
7
Fiscal Year
2003
Total Cost
$140,658
Indirect Cost
Name
Iowa State University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
005309844
City
Ames
State
IA
Country
United States
Zip Code
50011
Huang, Xiaoqiu; Brutlag, Douglas L (2007) Dynamic use of multiple parameter sets in sequence alignment. Nucleic Acids Res 35:678-86
Wang, Jianmin; Huang, Xiaoqiu (2005) A method for finding single-nucleotide polymorphisms with allele frequencies in sequences of deep coverage. BMC Bioinformatics 6:220
Ye, Liang; Huang, Xiaoqiu (2005) MAP2: multiple alignment of syntenic genomic sequences. Nucleic Acids Res 33:162-70
Huang, Xiaoqiu; Ye, Liang; Chou, Hui-Hsien et al. (2004) Efficient combination of multiple word models for improved sequence comparison. Bioinformatics 20:2529-33
Huang, Xiaoqiu; Wang, Jianmin; Aluru, Srinivas et al. (2003) PCAP: a whole-genome assembly program. Genome Res 13:2164-70
Lin, Yaw-Ling; Huang, Xiaoqiu; Jiang, Tao et al. (2003) MAVG: locating non-overlapping maximum average segments in a given sequence. Bioinformatics 19:151-2
Huang, Xiaoqiu; Chao, Kun-Mao (2003) A generalized global alignment algorithm. Bioinformatics 19:228-33
Huang, X; Madan, A (1999) CAP3: A DNA sequence assembly program. Genome Res 9:868-77
Huang, X; Adams, M D; Zhou, H et al. (1997) A tool for analyzing and annotating genomic sequences. Genomics 46:37-45