We will determine the complete nucleotide sequence of the Escherichia coli chromosome and analyze it. We will appraise, develop and implement improved sequence-gathering methodology; explore and develop various sequencing strategies; and use and develop analysis programs and algorithms. In the past decade the DNA sequences of virus and organelle genomes of increasing size and complexity have been determined. The largest genome to have been sequenced to date is that of Epstein-Barr virus (182 kilobase pairs). We feel that the total sequencing of a free-living life-form such as Escherichia coli (5 million base pairs) would be an appropriate next step that will be both technically feasible and scientifically rewarding. The complete sequence will provide a unique opportunity to analyze physical, genetic and organizational features of the whole genome. We will be able to make global statements about the genome's physical structure: its size, base content and distribution, the frequency and size of direct and inverted repeats, and the locations of potential loops, bends or Z-DNA. At the level of genetic organization, we will look for families of related genes and analyze their distribution in the genome. Besides developing a resource of biological information of inestimable value in its own right, the sequencing techniques and methodology and the programs and algorithms for analysis developed in this project will have important applications for other large sequencing projects.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
8R01HG000301-03
Application #
3333364
Study Section
Special Emphasis Panel (SSS (A))
Project Start
1988-07-01
Project End
1991-08-31
Budget Start
1990-07-01
Budget End
1991-08-31
Support Year
3
Fiscal Year
1990
Total Cost
Indirect Cost
Name
University of Wisconsin Madison
Department
Type
Schools of Earth Sciences/Natur
DUNS #
161202122
City
Madison
State
WI
Country
United States
Zip Code
53715
Burland, V; Plunkett 3rd, G; Sofia, H J et al. (1995) Analysis of the Escherichia coli genome VI: DNA sequence of the region from 92.8 through 100 minutes. Nucleic Acids Res 23:2105-19
Sofia, H J; Burland, V; Daniels, D L et al. (1994) Analysis of the Escherichia coli genome. V. DNA sequence of the region from 76.0 to 81.5 minutes. Nucleic Acids Res 22:2576-86
Burland, V; Plunkett 3rd, G; Daniels, D L et al. (1993) DNA sequence and analysis of 136 kilobases of the Escherichia coli genome: organizational symmetry around the origin of replication. Genomics 16:551-61
Chuang, S E; Burland, V; Plunkett 3rd, G et al. (1993) Sequence analysis of four new heat-shock genes constituting the hslTS/ibpAB and hslVU operons in Escherichia coli. Gene 134:1-6
Blattner, F R; Burland, V; Plunkett 3rd, G et al. (1993) Analysis of the Escherichia coli genome. IV. DNA sequence of the region from 89.2 to 92.8 minutes. Nucleic Acids Res 21:5408-17
Plunkett 3rd, G; Burland, V; Daniels, D L et al. (1993) Analysis of the Escherichia coli genome. III. DNA sequence of the region from 87.2 to 89.2 minutes. Nucleic Acids Res 21:3391-8
Burland, V; Daniels, D L; Plunkett 3rd, G et al. (1993) Genome sequencing on both strands: the Janus strategy. Nucleic Acids Res 21:3385-90
Daniels, D L; Plunkett 3rd, G; Burland, V et al. (1992) Analysis of the Escherichia coli genome: DNA sequence of the region from 84.5 to 86.5 minutes. Science 257:771-8