Progress in human physical mapping and automated DNA sequencing has made it feasible to begin sequencing the human genome at a substantial pace. The integrated YAC/cosmid physical map constructed at Los Alamos National Laboratory for chromosome 16 is among the most complete of the human chromosome maps. The 40 Mbp p arm of chromosome 16 is an excellent starting point for testing how effectively a physical map at 100 kb resolution can be converted into a set of minimally overlapping clones for sequencing and, ultimately, into the complete sequence of the chromosome. A nearly complete YAC map is available for chromosome 16, as is an extensive set of cosmids and cosmid contigs anchored to the YAC map by STSs spaced an average of 150 kb apart. The strategy proposed here is based on identifying BAC, PAC, and other stable, large-insert clones from 16p, selecting a non-overlapping set for initial sequencing, and developing a minimal tiling path of clones for sequencing to reach closure. Throughout the project, technology development will emphasize increasing the rate at which high-quality sequence data are generated and decreasing the cost. The first step will be to identify a set of minimally overlapping or non-overlapping BAC clones that cover human chromosome 16p, using the previously established chromosome 16p map. We estimate that approximately 200 such BACs will cover at least 70% of the short arm of chromosome 16. These clones will be used immediately for sequencing, which will require methods for making shotgun sequencing libraries from BAC clones so that they can be sequenced rapidly and efficiently. Non-overlapping BAC and cosmid clones from large contigs in the existing chromosome 16 map will be sequenced using a random shotgun strategy that produces high-quality finished sequence. The great majority of 16p can be sequenced by this approach.
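The selection of a minimal tiling path can be illustrated with a small sketch. It assumes each clone has already been placed on the physical map as an interval in kb, and it uses a standard greedy interval-cover heuristic rather than any procedure described in this proposal. The `Clone` type, the function name, and the example coordinates are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Clone(NamedTuple):
    """A mapped clone; coordinates are hypothetical map positions in kb."""
    name: str
    start: int
    end: int


def minimal_tiling_path(
    clones: List[Clone], region_start: int, region_end: int
) -> Tuple[List[Clone], List[Tuple[int, int]]]:
    """Greedy interval cover: at each position, pick the clone that starts
    at or before it and extends farthest. Uncovered stretches are reported
    as gaps instead of raising an error. Assumes end > start for each clone."""
    ordered = sorted(clones, key=lambda c: c.start)
    path: List[Clone] = []
    gaps: List[Tuple[int, int]] = []
    pos = region_start
    i = 0
    while pos < region_end:
        best = None
        # Scan every clone that starts at or before the current position.
        while i < len(ordered) and ordered[i].start <= pos:
            candidate = ordered[i]
            if candidate.end > pos and (best is None or candidate.end > best.end):
                best = candidate
            i += 1
        if best is None:
            # Nothing covers `pos`: record a gap up to the next clone, if any.
            if i >= len(ordered) or ordered[i].start >= region_end:
                gaps.append((pos, region_end))
                break
            gaps.append((pos, ordered[i].start))
            pos = ordered[i].start
            continue
        path.append(best)
        pos = best.end
    return path, gaps


# Hypothetical example: five clones on a 640 kb stretch with one gap.
example_clones = [
    Clone("A", 0, 150),
    Clone("B", 100, 260),
    Clone("C", 120, 300),
    Clone("D", 290, 420),
    Clone("E", 500, 640),
]
path, gaps = minimal_tiling_path(example_clones, 0, 640)
print([c.name for c in path], gaps)
```

In this toy input, clone B is skipped because clone C starts within clone A and reaches farther, and the stretch with no clone is reported as a gap that would need a closure strategy.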
Strategies for closing the map using YAC, BAC, PAC, P1, and cosmid clones, and for developing a complete sequence-ready clone set for 16p, will also be developed and tested. Several enhancements will be made to the software associated with automated sequencing, including improvements in sample tracking and data management, sequence assembly, and the incorporation of base-calling confidence values in assembly and editing. Together, the technology development and software improvements will reduce the cost of sequencing from $0.30 to $0.18 per finished base pair over three years. Accurate, annotated data will be provided to the scientific community within six months of completion of each segment.
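The figures quoted above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: it takes the 40 Mbp arm length, the estimate of 200 BACs covering 70% of the arm, and the two per-base costs from the text. Applying each cost to the entire arm is an assumption made here for illustration, not a budget stated in the proposal. Integer arithmetic (percent and cents) is used to avoid floating-point rounding.

```python
# Figures taken from the abstract.
ARM_LENGTH_BP = 40_000_000      # 16p, 40 Mbp
BAC_COUNT = 200                 # estimated initial BAC set
COVERAGE_PERCENT = 70           # "at least 70%" of the short arm
COST_START_CENTS = 30           # $0.30 per finished base pair
COST_TARGET_CENTS = 18          # $0.18 per finished base pair


def summarize() -> dict:
    """Return simple derived quantities; all names here are illustrative."""
    covered_bp = ARM_LENGTH_BP * COVERAGE_PERCENT // 100
    return {
        "covered_bp": covered_bp,
        # Average unique contribution per BAC implied by the estimate.
        "mean_bp_per_bac": covered_bp // BAC_COUNT,
        # ASSUMPTION: each per-base cost is applied to the whole arm.
        "arm_cost_at_start_usd": ARM_LENGTH_BP * COST_START_CENTS // 100,
        "arm_cost_at_target_usd": ARM_LENGTH_BP * COST_TARGET_CENTS // 100,
        "cost_reduction_percent":
            (COST_START_CENTS - COST_TARGET_CENTS) * 100 // COST_START_CENTS,
    }


summary = summarize()
for key, value in summary.items():
    print(f"{key}: {value:,}")
```

Under these assumptions, the estimate implies roughly 140 kb of unique sequence per BAC, and the planned change in per-base cost corresponds to a 40% reduction.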