With the genomic sequences of Drosophila and other eukaryotes complete, the next major challenge in genomics is the identification of complete gene and protein sets for each organism. A number of biological realities, such as differentially spliced or terminated genes, complicate the interpretation of the genomic sequences, and these difficulties are unlikely to be overcome by computational solutions alone. At the Berkeley Drosophila Genome Project (BDGP), we have recently finished sequencing a large set of expressed sequence tags (ESTs), which has allowed annotation of functionally expressed portions of the Drosophila genome. The ESTs have given rise to the Drosophila Gene Collection (DGC). We propose to essentially complete this unigene DGC. Further, we aim to identify alternatively spliced genes. Our cDNA and EST collections have already accelerated progress in generating a comprehensive transcript map of all Drosophila genes by providing information on intron-exon structure, alternative splicing, and transcriptional start and stop sites. Sequences from our EST and gene collections will assist in the authentication of predicted genes, the discovery of unannotated genes, and the refinement of existing gene models. We anticipate that these studies will provide information on the relative merits of approaches for completing cDNA gene collections, such as the Mammalian Gene Collection (MGC). Having a representative cDNA for every predicted gene would allow characterization of biologically significant genes expressed at low levels or in only a few cells. Our goals are to obtain a more detailed understanding of the complete set of proteins encoded by the Drosophila genome and to provide cDNAs and functional genomics resources to the research community.
These studies will provide information and tools that will further our understanding of higher eukaryotes and lay the groundwork for more complete analyses of genomic organization and protein function in Drosophila and other eukaryotes, including humans.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG002673-03
Application #
6887356
Study Section
Special Emphasis Panel (ZRG1-GNM (05))
Program Officer
Feingold, Elise A
Project Start
2003-04-28
Project End
2006-06-30
Budget Start
2005-04-01
Budget End
2006-06-30
Support Year
3
Fiscal Year
2005
Total Cost
$991,897
Indirect Cost
Name
Lawrence Berkeley National Laboratory
Department
Genetics
Type
Organized Research Units
DUNS #
078576738
City
Berkeley
State
CA
Country
United States
Zip Code
94720
Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H et al. (2015) The Release 6 reference sequence of the Drosophila melanogaster genome. Genome Res 25:445-58
Noyes, Marcus B; Christensen, Ryan G; Wakabayashi, Atsuya et al. (2008) Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites. Cell 133:1277-89
Noyes, Marcus B; Meng, Xiangdong; Wakabayashi, Atsuya et al. (2008) A systematic characterization of factors that regulate Drosophila segmentation via a bacterial one-hybrid system. Nucleic Acids Res 36:2547-60
Stapleton, Mark; Carlson, Joseph W; Celniker, Susan E (2006) RNA editing in Drosophila melanogaster: New targets and functional consequences. RNA 12:1922-32
Wan, Kenneth H; Yu, Charles; George, Reed A et al. (2006) High-throughput plasmid cDNA library screening. Nat Protoc 1:624-32
Chen, Li; Lullo, Dennis J; Ma, Enbo et al. (2005) Identification and analysis of U5 snRNA variants in Drosophila. RNA 11:1473-7
Tupy, Jonathan L; Bailey, Adina M; Dailey, Gina et al. (2005) Identification of putative noncoding polyadenylated transcripts in Drosophila melanogaster. Proc Natl Acad Sci U S A 102:5495-500
Hoskins, Roger A; Stapleton, Mark; George, Reed A et al. (2005) Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP). Nucleic Acids Res 33:e185