Quantifying molecular consequences of human missense variants with large-scale interactome perturbation studies

Alexov, Emil; Clark, Andrew; Yu, Haiyuan

Abstract

Almost all proteins function by interacting with other proteins. Previous studies have shown that the vast majority of damaging single amino acid mutations in proteins disrupt only a subset of specific protein-protein interactions, and that mutations in the same protein that disrupt different interactions tend to cause clinically distinct disorders. Therefore, it is of great importance to determine the interaction-specific disruptions caused by each mutation. Furthermore, rapid advances in sequencing technologies have enabled the identification of tens of millions of single nucleotide variants (SNVs) in the human population, driving an urgent need to understand the impact of each SNV on the human interactome network. Unfortunately, there is currently no method capable of predicting the specific impact of a large fraction of these SNVs on individual protein-protein interactions. To address this issue, we propose to leverage our massively parallel site-directed mutagenesis pipeline, Clone-seq, to generate clones for ~6,000 coding SNVs in the human population: ~4,000 from gnomAD and ~2,000 to be submitted by the international human genetics community. We will then experimentally examine the impact of every variant on protein stability and on individual protein-protein interactions using the high-throughput DUAL-FLUO and InPOINT (integrating PCA, LUMIER, Y2H, and wNAPPA) assays. This proposal brings together three groups with complementary expertise: high-throughput interactome experiments and network analysis from the Yu lab, genomic and population genetic studies from the Clark lab, and comprehensive biophysical and structural modeling of the impact of mutations on the binding free energy of protein interactions from the Alexov lab. Out of the ~6,000 SNVs, we expect to identify ~1,200 disruptive SNVs and ~4,000 different SNV-interaction pairs in which the SNV disrupts that specific interaction.
The data produced by our project will increase the available experimental information by >140 in number of human proteins and >500 in number of interactions, allowing us for the first time to comprehensively assess the relationships between the impact of SNVs on interactions and their various population genetic attributes (including, but not limited to, allele frequency and flanking haplotype, inter-population differentiation, local rate of recombination, allele age, modes of selection). Finally, we will establish a computational-experimental-integrated iterative learning scheme to build a multi-layer random-forest-based framework, SIMPACT, which can accurately predict specific impacts on all individual protein-protein interactions for all missense SNVs. Our proposed work will fuel hypothesis-driven research, will significantly improve our functional understanding of variants, and will likely fundamentally change the experimental design and data interpretation for whole genome/exome studies going forward.
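
As a rough check on the scale of the figures quoted in the abstract, the short Python sketch below derives the rates implied by its approximate counts (~6,000 SNVs, ~1,200 disruptive SNVs, ~4,000 disrupted SNV-interaction pairs). The variable names are illustrative only, and the inputs are the abstract's stated expectations, not measured results.

```python
# Approximate counts quoted in the abstract (expectations, not measured results).
gnomad_snvs = 4_000          # coding SNVs to be drawn from gnomAD
community_snvs = 2_000       # coding SNVs to be submitted by the genetics community
expected_disruptive = 1_200  # SNVs expected to be disruptive
expected_pairs = 4_000       # expected disrupted SNV-interaction pairs

total_snvs = gnomad_snvs + community_snvs

# Fraction of tested SNVs expected to be disruptive.
disruptive_fraction = expected_disruptive / total_snvs

# Average number of disrupted interactions per disruptive SNV.
pairs_per_disruptive_snv = expected_pairs / expected_disruptive

print(f"total SNVs: {total_snvs}")
print(f"disruptive fraction: {disruptive_fraction:.0%}")
print(f"disrupted interactions per disruptive SNV: {pairs_per_disruptive_snv:.1f}")
```

Under these assumptions, roughly one in five tested SNVs would be disruptive, and each disruptive SNV would on average disrupt a little over three interactions.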

Public Health Relevance

The dramatic increase of DNA variants discovered through advances in sequencing technologies has been inadequately translated into therapeutic successes. Although many of these variants are related to human disorders, the overwhelming number of non-functional variants makes the assessment of functional significance a steep challenge. In this study, we aim to develop a high-throughput pipeline to quickly clone and directly test a large number of coding variants for their impact on the human interactome network and use the results to build a machine learning pipeline to predict functional impact of all coding variants, in anticipation that both our experimental data and computational pipeline will lead to broad clinical and therapeutic applications.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM125639-03
Application #: 9872026
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Krasnewich, Donna M

Project Start: 2018-01-23
Project End: 2021-12-31
Budget Start: 2020-01-01
Budget End: 2020-12-31
Support Year: 3
Fiscal Year: 2020

Institution

Name: Cornell University
Department: Miscellaneous
Type: Organized Research Units
DUNS #: 872612445

City: Ithaca
State: NY
Country: United States
Zip Code: 14850

Related projects


NIH 2021 R01 GM	Quantifying molecular consequences of human missense variants with large-scale interactome perturbation studies Alexov, Emil Georgiev; Clark, Andrew G.; Yu, Haiyuan / Cornell University
NIH 2020 R01 GM	Quantifying molecular consequences of human missense variants with large-scale interactome perturbation studies Alexov, Emil Georgiev; Clark, Andrew G.; Yu, Haiyuan / Cornell University
NIH 2019 R01 GM	Quantifying molecular consequences of human missense variants with large-scale interactome perturbation studies Alexov, Emil Georgiev; Clark, Andrew G.; Yu, Haiyuan / Cornell University
NIH 2018 R01 GM	Quantifying molecular consequences of human missense variants with large-scale interactome perturbation studies Alexov, Emil Georgiev; Clark, Andrew G.; Yu, Haiyuan / Cornell University

Publications

Chen, Siwei; Fragoza, Robert; Klei, Lambertus et al. (2018) An interactome perturbation framework prioritizes damaging missense mutations for developmental disorders. Nat Genet 50:1032-1040
