Drosophila has a rich history as a model system, and many fundamental insights in biology were first made in this organism. Importantly, findings made in the fly can impact human health by revealing conserved protein functions and pathways relevant to human development and to a large number of human diseases. Although the Drosophila genome is arguably the best-annotated multicellular eukaryotic genome, much remains to be learned about the functions and interactions of Drosophila proteins. Moreover, an ultimate goal of biology is not just to understand individual gene or protein activities but also to develop quantitative and global models that can describe biological systems. Gaining such a system-wide understanding of Drosophila requires the application of several large-scale approaches, including proteome-scale analyses of protein complexes and binary protein interactions. Although proteomics approaches show promise in revealing the complexity and structure of functional networks, it is increasingly clear that, to date, these studies have sampled only a small fraction of the interactions that actually occur. Here, we propose to perform a state-of-the-art, high-throughput, highly quality-controlled screen for high-confidence identification of binary protein interactions among 10,000 Drosophila open reading frames (ORFs). We will use our established pipeline to test positive and random reference sets in order to establish optimal vectors and conditions, followed by multiple rounds of large-scale binary interaction screening and validation. All stages of the process will rely on a high degree of quality control, digital tracking and automation. The binary screen results will be integrated with other datasets to generate a high-confidence "interactome" and to help guide choices for further molecular genetic and other investigations.
Our groups, the Berkeley Drosophila Genome Project (BDGP), the Center for Cancer Systems Biology (CCSB) and the Drosophila RNAi Screening Center (DRSC), include experts and innovators in the type of large-scale studies and analyses we propose. Importantly, our groups are also committed to making the resulting high-quality material resources and data rapidly available to the widest possible community. Given the high degree of conservation between Drosophila and other species, the results of this large-scale, high-confidence binary interaction analysis and data integration will have a significant impact on our general understanding of how proteins interact to orchestrate complex cellular functions and of how perturbation of protein networks leads to specific phenotypes and diseases. They should also lead to a large number of new testable hypotheses.

Public Health Relevance

We propose to perform a state-of-the-art, high-throughput, quality-controlled analysis of binary protein interactions in Drosophila. The resulting next-generation interactome will provide a much more complete picture of possible protein interactions in this model system. This resource will have a major impact on the generation of new hypotheses regarding individual protein function, mining of existing data sets such as those generated from genome-wide RNAi screens, and the analysis of protein networks.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Feingold, Elise A
Harvard University
Schools of Medicine
United States