The completion of a high-quality human genome sequence, together with the anticipated completion of a high-quality mouse genome sequence, provides a unique opportunity to develop genome-wide functional analyses of mammalian origins of replication. In this application, we propose to develop strategies for cloning and sequencing relatively complete libraries of origins, with the long-range goals of uncovering common sequence elements and understanding how origins are distributed in the genome vis-a-vis other functional elements. Because so few origins have been uncovered by traditional, labor-intensive approaches, it has not been possible to address these issues.
Specific aims of the proposal are as follows.
Aim 1. To prepare representative origin libraries from a model Chinese hamster ovary cell line that is easily synchronized and that has amplified the dihydrofolate reductase origin of replication approximately 1,000 times. Origin-centered nascent DNA will be labeled in vitro in the first few minutes of the S-period with BrdU and biotinylated-dATP. The nascent DNA will then be extruded and purified by affinity chromatography, and the 1,000 bp fraction will be excised from a sizing gel and cloned. An informative 2-D gel replicon mapping approach will be used to validate the purity of the library and guide the purification scheme. The sequences of several thousand clones will then be mapped onto the murine genome to determine their locations relative to genes.
Aim 2. To prepare an early-firing origin library from human Chr 1 by using strategies developed in Aim 1 on a CHO/human hybrid cell line. Several thousand clones from the resulting library will be sequenced and those of human derivation will be mapped onto the human genome.
Aim 3. To prepare comprehensive origin libraries from log-phase human cells, which are difficult to synchronize. To decrease the background contribution from small, non-origin DNA, which is more significant in unsynchronized cells, nuclei from the in vitro reactions will be encapsulated in agarose beads, and purification of the small, labeled, origin-containing DNA will be carried out in the beads. Alternatively, after the in vitro reactions, replication intermediates will be stabilized and enriched by utilizing their affinity for the nuclear matrix.
Aim 4. To identify the most common sequence motifs contained in validated origins using computational approaches. Sequences from 80 percent of validated origin clones will constitute a training set, which will be examined using similarity-, motif-, composition-, and higher-order-structure-based approaches. The predictive value of the computational approach will be tested on the remaining 20 percent of validated origins (the test set).
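The training-set and test-set design of Aim 4 can be illustrated with a minimal sketch. It assumes the validated origin clones are available as plain DNA strings, and it uses simple k-mer presence counting as a stand-in for the motif-based approaches. All function names, the value of k, and the random seed are hypothetical choices made for illustration, not part of the proposed methods.

```python
import random
from collections import Counter


def split_train_test(seqs, train_fraction=0.8, seed=0):
    """Shuffle sequences reproducibly and split them into training and test sets."""
    shuffled = list(seqs)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def kmer_presence(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def top_motifs(train, k=6, n=5):
    """Return the n k-mers found in the largest number of training sequences."""
    counts = kmer_presence(train, k)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [kmer for kmer, _ in ranked[:n]]


def fraction_containing(seqs, motif):
    """Fraction of sequences that contain the given motif."""
    if not seqs:
        return 0.0
    return sum(motif in s.upper() for s in seqs) / len(seqs)
```

In use, candidate motifs would be chosen from the training set only, and fraction_containing would then be evaluated on the held-out test set and on a set of non-origin control sequences, so that the predictive value is not measured on the same clones used to find the motifs.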