Large-scale protein-protein interaction assays are widely useful in studies of protein interaction networks, drug activity, and protein engineering. The protein-fragment complementation assay (PCA) screens for in vivo protein-protein interactions (PPIs) in Saccharomyces cerevisiae by converting the strength of an interaction to a relative fitness. For each PPI, protein-fragment-containing haploid yeast pairs are mated and the resulting diploids are tested one at a time. We have developed three technologies that will make PCA more repeatable and higher throughput: 1) a random DNA barcode system that allows us to construct millions of uniquely barcoded yeast strains, 2) a double barcode system that translocates two barcodes on homologous chromosomes into close proximity on the same chromosome, and 3) a pooled fitness assay that allows for accurate measurements of the relative fitness of millions of barcoded genotypes simultaneously via next-generation sequencing. We propose to combine PCA with these three innovations to generate a massively parallel protein-protein interaction sequencing platform (PPiSeq). Random barcodes are inserted into yeast strains, which are then mated to existing PCA strains. Mating of barcoded haploid PCA pools and translocation of barcodes in vivo and en masse yields diploid PCA strains, each with a double barcode representing a specific PPI. Growth of cell pools and sequencing of double barcodes yields an accurate fitness measurement of each double barcode in the pool, which can be translated to an interaction score for each pairwise protein combination. PPiSeq will have several significant advantages over traditional PCA: it is fast, cheap, highly scalable, and has a low barrier to entry. Notably, PPiSeq throughput scales quadratically with the number of PCA strains, while its costs decline at the rate of next-generation sequencing costs.
Here, we will construct a large diploid PPiSeq library consisting of ~655,000 unique PPIs and ~6 million double barcodes and use this library to measure the interaction score of each PPI simultaneously via pooled growth (AIM 1). One additional major advantage of PPiSeq over traditional PCA is repeatability across perturbations. That is, once the library is constructed, all pairwise interactions can easily be re-assayed in new environments. We will assay how the yeast protein interactome changes in a heat gradient (AIM 2) and in the presence of three antifungal drugs (AIM 3). This work will provide the first genome-scale view of how the protein interactome changes across perturbations. The constructed PPiSeq library will be provided as a resource to the scientific community for future perturbation studies. Additionally, this work will set the stage for future large-scale PPI screens, such as those involved in drug discovery and protein engineering.
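
The pooled fitness measurement described above (growing the pool, sequencing double barcodes, and converting barcode trajectories to an interaction score per PPI) can be illustrated with a minimal sketch. It is not the estimator used in this project: it assumes a simple log-linear model in which a barcode's fitness is the slope of its log frequency over generations relative to a non-interacting control barcode, and it averages the barcodes that tag the same protein pair. All names, the pseudocount, and the toy read counts are hypothetical.

```python
import math
from statistics import mean


def log_frequencies(counts, pseudocount=0.5):
    """Convert read counts at one time point to log frequencies.

    The pseudocount (an assumption) keeps zero counts finite.
    """
    total = sum(c + pseudocount for c in counts.values())
    return {bc: math.log((c + pseudocount) / total) for bc, c in counts.items()}


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def relative_fitness(generations, counts_by_time, reference):
    """Slope of log frequency per generation, minus the reference slope."""
    logf = [log_frequencies(c) for c in counts_by_time]
    slopes = {
        bc: slope(generations, [lf[bc] for lf in logf])
        for bc in counts_by_time[0]
    }
    return {bc: s - slopes[reference] for bc, s in slopes.items()}


def ppi_scores(fitness, barcode_to_ppi):
    """Average the fitness of all double barcodes tagging the same PPI."""
    grouped = {}
    for bc, f in fitness.items():
        if bc in barcode_to_ppi:  # skip control barcodes
            grouped.setdefault(barcode_to_ppi[bc], []).append(f)
    return {ppi: mean(fs) for ppi, fs in grouped.items()}


# Toy data (hypothetical): read counts per double barcode at three time points.
generations = [0, 3, 6]
counts_by_time = [
    {"bc1": 100, "bc2": 100, "bc3": 100, "ref": 100},
    {"bc1": 200, "bc2": 180, "bc3": 90, "ref": 100},
    {"bc1": 400, "bc2": 330, "bc3": 80, "ref": 100},
]
barcode_to_ppi = {"bc1": ("A", "B"), "bc2": ("A", "B"), "bc3": ("A", "C")}

fitness = relative_fitness(generations, counts_by_time, reference="ref")
scores = ppi_scores(fitness, barcode_to_ppi)
for ppi, score in scores.items():
    print(ppi, round(score, 3))
```

In this toy pool, the two barcodes tagging the pair (A, B) rise in frequency and yield a positive score, while the barcode tagging (A, C) declines and yields a negative one. Real data would need replicate handling and a noise model, which this sketch omits.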

Public Health Relevance

Most drugs target a protein and prevent it from performing its normal function via interactions with other proteins. Thus, large-scale studies of protein-protein interactions are necessary to identify new drugs, reduce off-target effects of existing drugs, and discover how drug resistance develops. To aid in these efforts, we are developing a high-throughput assay that allows one to cheaply measure the strength of millions of protein-protein interactions simultaneously.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 3R01HG008354-02S1
Application #: 9288060
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Gilchrist, Daniel A
Project Start: 2015-09-16
Project End: 2019-06-30
Budget Start: 2016-09-26
Budget End: 2017-06-30
Support Year: 2
Fiscal Year: 2016
Total Cost: $59,914
Indirect Cost: $21,994

Name: State University New York Stony Brook
Department:
Type: Organized Research Units
DUNS #: 804878247
City: Stony Brook
State: NY
Country: United States
Zip Code: 11794

Li, Fangfei; Salit, Marc L; Levy, Sasha F (2018) Unbiased Fitness Estimation of Pooled Barcode or Amplicon Sequencing Studies. Cell Syst 7:521-525.e4
Zhao, Lu; Liu, Zhimin; Levy, Sasha F et al. (2018) Bartender: a fast and accurate clustering algorithm to count barcode reads. Bioinformatics 34:739-747
Frumkin, Idan; Schirman, Dvir; Rotman, Aviv et al. (2017) Gene Architectures that Minimize Cost of Gene Expression. Mol Cell 65:142-153
Smith, Justin D; Schlecht, Ulrich; Xu, Weihong et al. (2017) A method for high-throughput production of sequence-verified DNA libraries and strain collections. Mol Syst Biol 13:913
Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R et al. (2017) A scalable double-barcode sequencing platform for characterization of dynamic protein-protein interactions. Nat Commun 8:15586
Jaffe, Mia; Sherlock, Gavin; Levy, Sasha F (2017) iSeq: A New Double-Barcode Method for Detecting Dynamic Genetic Interactions in Yeast. G3 (Bethesda) 7:143-153