Thirty megabases of DNA sequence from the human X chromosome will be generated over three years at a cost of less than 35 cents per finished base. The data will be of high quality, with fewer than one error per 10,000 nucleotides. High-resolution cosmid maps of the Xq28 and Xp22 regions will be generated by automated restriction endonuclease digestion mapping. Selected cosmids will be analyzed by a balanced random and directed strategy based on double-ended sequencing of subclones, which will minimize redundancy, reduce dependence on expensive oligonucleotide walks, and avoid excessive redundant sequencing in overlapping regions. Gaps and ambiguities in the cosmid maps will be resolved by isolating alternative genomic clones and by coordinating ongoing partial sequencing with the mapping. New technologies to improve large-scale sequencing will be developed. Simplified construction of high-quality shotgun libraries and a rapid DNA template preparation scheme will allow partial automation of front-end steps. Novel fluorescent dye-labeled oligonucleotide primers will enable streamlined sequencing reactions, and the resulting signal enhancements will simplify DNA base calling. The informatics infrastructure that coordinates all steps, from cosmid mapping and DNA sequencing through data assembly, validation, annotation and database submission, will be streamlined to allow higher throughput. In addition to the DNA sequence, this project will generate a model for an expanded program that can allow timely sequencing of the whole human genome.
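
The headline targets in the abstract imply a few simple upper bounds, which can be worked out as a rough sketch. The constants below are taken from the abstract; the function names are hypothetical, and the Phred-scale conversion (Q = -10 log10 of the error probability) is a standard convention that the abstract itself does not mention. The results are ceilings implied by the stated limits, not reported project figures.

```python
import math

# Targets stated in the abstract.
TARGET_BASES = 30_000_000        # thirty megabases
MAX_CENTS_PER_BASE = 35          # "less than 35 cents per finished base"
BASES_PER_ERROR = 10_000         # at most one error per 10,000 nucleotides
PROJECT_YEARS = 3


def cost_ceiling_usd(bases: int, cents_per_base: int) -> float:
    """Upper bound on sequencing cost in dollars (integer cents avoid float drift)."""
    return bases * cents_per_base / 100


def error_ceiling(bases: int, bases_per_error: int) -> int:
    """Upper bound on the number of erroneous bases in the finished sequence."""
    return bases // bases_per_error


def phred_equivalent(bases_per_error: int) -> float:
    """Error rate expressed on the Phred scale (assumption: not used in the abstract)."""
    return -10 * math.log10(1 / bases_per_error)


if __name__ == "__main__":
    print("Cost ceiling (USD):", cost_ceiling_usd(TARGET_BASES, MAX_CENTS_PER_BASE))
    print("Error ceiling (bases):", error_ceiling(TARGET_BASES, BASES_PER_ERROR))
    print("Average bases per year:", TARGET_BASES // PROJECT_YEARS)
    print("Phred-scale equivalent:", phred_equivalent(BASES_PER_ERROR))
```

Under these assumptions the sequencing cost stays below about 10.5 million dollars, the finished sequence contains fewer than about 3,000 erroneous bases, and the average output is about 10 megabases per year.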

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG001459-02
Application #: 2392523
Study Section: Special Emphasis Panel (SRC (01))
Project Start: 1996-04-11
Project End: 1999-06-30
Budget Start: 1997-04-01
Budget End: 1998-06-30
Support Year: 2
Fiscal Year: 1997
Total Cost: not listed
Indirect Cost: not listed

Name: Baylor College of Medicine
Department: Genetics
Type: Schools of Medicine
DUNS #: 074615394
City: Houston
State: TX
Country: United States
Zip Code: 77030

Bouck, J; McLeod, M P; Worley, K et al. (2000) The human transcript database: a catalogue of full length cDNA inserts. Bioinformatics 16:176-7
Bouck, J; Yu, W; Gibbs, R et al. (1999) Comparison of gene indexing databases. Trends Genet 15:159-62
Bouck, J; Miller, W; Gorrell, J H et al. (1998) Analysis of the quality and utility of random shotgun sequencing at low redundancies. Genome Res 8:1074-84
Muzny, D M; Metzker, M L; Bouck, J et al. (1998) Using BODIPY dye-primer chemistry in large-scale sequencing. IEEE Eng Med Biol Mag 17:88-93
Ansari-Lari, M A; Oeltjen, J C; Schwartz, S et al. (1998) Comparative sequence analysis of a gene-rich cluster at human chromosome 12p13 and its syntenic region in mouse chromosome 6. Genome Res 8:29-40
Metzker, M L; Raghavachari, R; Burgess, K et al. (1998) Elimination of residual natural nucleotides from 3'-O-modified-dNTP syntheses by enzymatic mop-up. Biotechniques 25:814-7
Huq, A H; Sutcliffe, J S; Nakao, M et al. (1997) Sequencing and functional analysis of the SNRPN promoter: in vitro methylation abolishes promoter activity. Genome Res 7:642-8
Ansari-Lari, M A; Shen, Y; Muzny, D M et al. (1997) Large-scale sequencing in human chromosome 12p13: experimental and computational gene structure determination. Genome Res 7:268-80
Schaefer, L; Prakash, S; Zoghbi, H Y (1997) Cloning and characterization of a novel rho-type GTPase-activating protein gene (ARHGAP6) from the critical region for microphthalmia with linear skin defects. Genomics 46:268-77
Oeltjen, J C; Malley, T M; Muzny, D M et al. (1997) Large-scale comparative sequence analysis of the human and murine Bruton's tyrosine kinase loci reveals conserved regulatory domains. Genome Res 7:315-29