A working draft of the human genome has been completed. Transcription of RNA is one of the functional processes that permit the transfer of encoded information from the DNA sequence into function. Recent evidence suggests that the catalogue of transcripts made from the human genome is more complex than indicated by current annotations. The locations and characteristics of the regulatory elements encoded in the genome that control the expression of these transcripts are even less well understood. The goal of this proposal is to describe a collection of generic and unbiased strategies that can be used, on a genome-wide scale, to locate the sites of transcription and the functional elements that regulate RNA expression. These strategies use high-density oligonucleotide arrays and chromatin immunoprecipitation (ChIP) as the core technologies that will enable the mapping of these sites. For the first year of our proposed ENCODE project, 30 Mb of distributed genomic sequence (approximately 1% of the genome), selected from 44 different chromosomal locations ranging in size from 500 kb to 2 Mb, will serve as target sequences. A single array with approximately 810,000 probe pairs that interrogate the 30 Mb of sequence every 37 bp on average will be synthesized and used as a common platform to map the locations of both transcription and functional regulatory elements. The functional elements to be monitored are the binding sites for 15 transcription factors and 4 repressors, as well as the locations of 7 types of histone modifications that have been correlated with RNA regulation. Three well-characterized phorbol ester- or retinoic acid-responsive cell lines (Jurkat, NCCIT and HL-60) will be monitored over time after activation by these molecules and will provide the biological context for these studies.
To demonstrate the scalability of this collection of strategies, the second and third years of our proposed studies will be focused on constructing a similar collection of maps, at a resolution of 35 bp on average, but across the entire human genome.
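The probe-spacing figures above follow from simple arithmetic, sketched here in Python. The 30 Mb target, the 810,000 probe pairs, and the 35 bp genome-wide resolution come from the text. The genome length of 3,000,000,000 bp is an assumption chosen to match the "approximately 1%" statement, and the function names are illustrative only. The genome-wide count is a rough upper bound, since a real design would likely exclude some sequence (for example, repeats).

```python
import math


def mean_probe_spacing(target_bp: int, probe_pairs: int) -> float:
    """Average distance in bp between consecutive probe pairs on the target."""
    return target_bp / probe_pairs


def probe_pairs_needed(target_bp: int, spacing_bp: float) -> int:
    """Probe pairs required to tile target_bp at the given average spacing."""
    return math.ceil(target_bp / spacing_bp)


# Figures stated in the abstract.
ENCODE_TARGET_BP = 30_000_000   # 30 Mb of distributed target sequence
ENCODE_PROBE_PAIRS = 810_000    # approximate probe pairs on the single array
GENOME_RESOLUTION_BP = 35       # planned genome-wide average resolution

# Assumption (not from the abstract): a round 3 Gb genome length.
ASSUMED_GENOME_BP = 3_000_000_000

spacing = mean_probe_spacing(ENCODE_TARGET_BP, ENCODE_PROBE_PAIRS)
genome_pairs = probe_pairs_needed(ASSUMED_GENOME_BP, GENOME_RESOLUTION_BP)

print(f"Year 1 mean spacing: {spacing:.1f} bp")
print(f"Genome-wide probe pairs at {GENOME_RESOLUTION_BP} bp: {genome_pairs:,}")
```

Under these assumptions, the first-year array works out to roughly 37 bp between probe pairs, and the genome-wide maps would require on the order of 86 million probe pairs, about a hundredfold more than the first-year array.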