The implicit driving force behind the Human Genome Project is the identification of the biological functions encoded by the human genome. To fully realize this goal, the sequence of the human genome must not only be determined but also extensively annotated with features of biological importance. Because of the difficulties of genetic analysis in humans and other vertebrates, and the growing evidence for evolutionary conservation of both individual genes and entire genetic pathways, the genome of Drosophila melanogaster will be used for my studies. The long-term goal is to provide information on the function of human genes. My goals for the fellowship period focus primarily on developing a specific strategy for cloning cDNAs containing the 5' ends of Drosophila mRNAs and on identifying full-length cDNAs derived from RNAs encoded by a genomic region of known sequence. Since the 2 Mb Adh region (chromosomal region 34D-36A) will have been completely sequenced and is genetically well characterized, a newly identified cDNA can be easily aligned with the genomic DNA sequence and correlated with the genetic map. In addition, noncoding DNA sequences regulating Drosophila gene expression, such as 5' promoters and introns, can be easily identified by comparing cDNA and genomic DNA sequences. Information obtained from such studies will, in turn, serve as a starting point not only for further characterization of the biological function of protein-coding regions, but also for the study of noncoding DNA sequences regulating eucaryotic gene expression and differential RNA splicing. Finally, in collaboration with computational biologists within the Center, this extensively annotated sequence will be used as a test bed for evaluating computational approaches to interpreting genomic DNA sequences.