The proteome can be viewed as constellations of interacting protein modules, which are organized into organelles, molecular machines, and signal transduction networks. To date, the interaction partners for only a small fraction of expressed proteins are known, yet protein-protein associations are a major driving factor in understanding protein function and activity. During the last 2 years, with joint support from NHGRI and Biogen-Idec, we have implemented a high-throughput pipeline for analyzing the interactions of ~10,000 human proteins, each tagged at its C-terminus with an HA tag and expressed in 293T cells. We refer to this as a first-pass interactome analysis. So far, this has led to the analysis of ~5900 proteins (as of Sept. 20th, 2014) by AP-MS, 2594 of which have already been deposited in the BioGRID database based on their inclusion in a first paper describing the pipeline. Additional interactions are being released quarterly, and all are freely available to the public. The infrastructure required to perform a project of this scale is unprecedented in the field of interaction proteomics. Indeed, we have integrated numerous new informatic and operational elements into a pipeline that allows a small team of scientists to analyze 500-600 baits by AP-MS per month. Numerous quality control measures have been developed, and comparison of the interaction partners of the first 2594 bait proteins with existing protein interaction databases reveals extensive overlap with the highest-quality database, CORUM, while also providing experimental evidence for over 20,000 novel interactions that have not been previously reported. The network identifies more than 250 clusters of proteins and includes both known and novel complexes. In this proposal, we seek to continue production with this pipeline, with 3 major goals that were chosen as priority objectives based on discussions with NHGRI.
In Specific Aim 1, we will complete the analysis of the remaining ORF clones that are available (~4000 clones). We will also revisit 1,000 ORFs that failed to generate stable cell lines in the first-pass analysis, and will examine 1,000 membrane proteins using a different detergent that more readily solubilizes certain classes of membrane proteins than the detergent used in the first-pass analysis.
In Specific Aim 2, we will test the cell-type specificity of interactions for the first 2600 baits in a second cell line (HCT116), thereby providing validation for a subset of interactions and revealing candidate cell-type-specific interactions.
In Specific Aim 3, we will employ a Tandem Mass Tagging approach to compare interactions for ~50 important disease genes and several mutant alleles for each, thereby implementing a highly comparative platform for understanding how disease mutations alter signaling networks, molecular machines, and pathways. All data will be disseminated through public databases and will be subject to extensive bioinformatics and network analysis.
Discovering a protein's function is important for understanding its role in both normal and abnormal biology, including human disease. A critical step in elucidating a protein's function and regulation is determining its interaction partners. Thus, a comprehensive human interactome map would be expected to significantly impact research across potentially all disciplines of human health.