Annotating effects of disease mutations on proteins and protein interactions

Panchenko, Anna

Abstract

Next-generation sequencing technologies have brought the dream of individual genome identification close to reality. However, new advances in genome sequencing are necessary but not sufficient for identifying functionally important variants and understanding the origins of many diseases. A specific human phenotype is largely determined by the stability, activity, and interactions of numerous biomolecules that work together to provide specific cellular functions. Although the majority of genetic variations are likely to be neutral, a substantial fraction of them might explain the origins of Mendelian and complex diseases. Somatic mutations may contribute significantly to tumorigenesis, and driver mutations may allow cancer cells to sustain proliferative signaling. However, finding functionally important mutations and predicting their molecular mechanisms remains largely an unsolved problem.

Many diseases are caused by protein malfunction, and missense mutations, which can render proteins nonfunctional, may be responsible for many of them. From a clinical perspective, these non-neutral mutations affecting human health are of primary interest. For some diseases and genes, particularly those following Mendelian inheritance patterns, the causal genotype-phenotype relationship has already been established, while for complex polygenic diseases involving multiple factors it is still unknown. Moreover, genetic variants with low penetrance, which are only weakly associated with disease phenotypes, can be annotated only in large samples, and for many diseases the genetic determinants have yet to be discovered.

Signaling networks involve dense sets of protein interactions and are often deregulated in many diseases, including cancer.
Therefore, the analysis of protein complexes, disease-related interaction networks, and the effects of disease mutations on network properties would give us important clues for understanding the molecular mechanisms of diseases and could inform their treatment and prevention. In fact, many disease mutations are located on protein-binding interfaces and may affect the specificity of recognition and the protein binding affinity. A missense mutation that alters protein binding affinity may significantly perturb or completely abolish protein function, potentially leading to disease. The availability of computational methods to evaluate the impact of mutations on protein-protein binding is critical for a wide range of biomedical applications. There is a persistent need for a mechanistic understanding of the impact of variants on proteins.

To address this need, we introduce a new computational method, MutaBind, which evaluates the effects of sequence variants and disease mutations on protein interactions and calculates the quantitative changes in binding affinity. MutaBind uses molecular mechanics force fields, statistical potentials, and fast side-chain optimization algorithms. It maps mutations onto the structure of a protein complex, calculates the associated changes in binding affinity, determines whether a mutation is deleterious, estimates the confidence of this prediction, and produces a structural model of the mutant for download.

The monomeric Casitas B-lineage lymphoma (Cbl) RING E3 ligases are emerging therapeutic targets in cancer treatment. The RING domain of Cbl has E3 ligase activity and ubiquitinates activated receptor tyrosine kinases. Cbl proteins can bind the ubiquitin-conjugating enzyme (E2) in a complex with ubiquitin (Ub) and a substrate protein, and facilitate the transfer of Ub from E2 to a lysine residue of the substrate.
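
For orientation, the change in binding affinity that MutaBind is described as calculating is conventionally expressed as a change in binding free energy, ΔΔG = ΔG(mutant) − ΔG(wild type). The sketch below is not MutaBind's algorithm, which relies on force fields and statistical potentials. It only illustrates the quantity itself, assuming that measured dissociation constants are available and using the standard relation ΔG = RT ln Kd. The function name and the example values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_binding(kd_wild_type, kd_mutant, temperature_k=298.15):
    """Return the change in binding free energy (kcal/mol) upon mutation.

    Uses dG = RT * ln(Kd), so ddG = RT * ln(Kd_mutant / Kd_wild_type).
    Positive values mean the mutation weakens binding.
    """
    if kd_wild_type <= 0 or kd_mutant <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


# Hypothetical example: a mutation that weakens binding tenfold (10 nM -> 100 nM).
print(round(ddg_binding(1e-8, 1e-7), 2))  # about 1.36 kcal/mol
```

A tenfold loss of affinity at room temperature therefore corresponds to roughly 1.4 kcal/mol, which gives a sense of scale for predicted ΔΔG values.
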
Cbl is a convenient system for investigating the mechanistic aspects of cancer mutations because several Cbl structures are available that represent snapshots of different stages of the Cbl activation cycle. We applied our previously developed computational method to predict the effects of Cbl mutations on different Cbl states. Our predictions were tested in blind experiments using Cbl-mediated EGFR ubiquitination assays, which showed remarkable agreement between the experimental densitometry data and the predicted changes in Cbl stability. We showed both experimentally and computationally that about one third of the tested cancer mutations were probably passengers that did not affect ubiquitination activity, while the others, potentially driver mutations, affected different stages of the Cbl activation cycle, either completely abolishing its ligase activity or partially attenuating it.

The evolution of cancer is driven by somatic mutations and by clonal selection acting on them. It is therefore important to decouple mutagenesis from selection in order to characterize driving events in tumor evolution. Mutational processes can be affected by the local DNA sequence context around the mutated site. We developed a new computational method, MutaGene, that estimates and analyzes context-dependent mutational signatures and uses them to calculate the expected background mutability of nucleotides and amino acids at DNA and protein positions, connecting processes operating at the DNA level to protein phenotype. Mutability can be used to identify cancer driver mutations, thereby linking cancer genotype with phenotype and decoupling the relative contributions of mutagenesis and selection in carcinogenesis.

Nucleosomes are the elementary building blocks of chromatin and are unique systems for studying protein-protein and protein-DNA binding and the principles of their regulation. There are four types of core histones (H3, H4, H2A, and H2B), with two copies of each forming the histone octamer of the nucleosome core particle.
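
The idea of context-dependent background mutability described for MutaGene above can be illustrated with a toy calculation. The sketch below is not MutaGene's implementation. It assumes a signature is simply the relative frequency of each substitution within its trinucleotide context, folded onto the pyrimidine strand as is common for mutational signatures, and it omits the normalization by genome-wide context frequencies that a real analysis would need. All function names and the example mutations are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def normalize(context, alt):
    """Express a substitution with a pyrimidine (C or T) as the reference base."""
    if context[1] in "AG":
        return reverse_complement(context), alt.translate(COMPLEMENT)
    return context, alt


def build_signature(mutations):
    """Relative frequency of each (trinucleotide context, alt base) pair.

    `mutations` is a list of (context, alt) tuples, e.g. ("TCA", "T") for
    a C>T change flanked by T and A. This is a toy, unnormalized signature.
    """
    counts = Counter(normalize(ctx, alt) for ctx, alt in mutations)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def site_mutability(seq, pos, signature):
    """Sum the signature weights of the three possible substitutions at seq[pos]."""
    if pos < 1 or pos > len(seq) - 2:
        raise ValueError("position needs one flanking base on each side")
    context = seq[pos - 1:pos + 2]
    ref = context[1]
    return sum(
        signature.get(normalize(context, alt), 0.0)
        for alt in "ACGT"
        if alt != ref
    )


# Hypothetical observed mutations; ("TGA", "A") folds onto ("TCA", "T").
observed = [("TCA", "T"), ("TCA", "T"), ("TGA", "A"), ("ACG", "T")]
signature = build_signature(observed)
print(site_mutability("TTCAG", 2, signature))  # 0.75
```

Sites whose observed mutation frequency in tumors far exceeds this kind of expected background are the ones a mutability-based approach would flag as candidates for selection.
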
Long N-terminal histone tails protrude from the octamer and have many post-translational modification sites, which constitute the so-called histone code. The basic histone types are encoded by sets of genes that give rise to families of histone variants, all of which can be incorporated into nucleosomes and may have functional and structural significance. Histone variants have been implicated in many important biological processes, including transcription regulation, DNA repair, heterochromatin formation, chromosome segregation, and mitosis.

We developed a new version of the HistoneDB database, 'HistoneDB 2.0--with variants', a comprehensive database of histone protein sequences classified by histone type and variant. All entries are supplemented by rich sequence and structural annotations, and many interactive tools are provided to explore and compare sequences of different variants from various organisms.

The determination of the nucleosome core particle structure by X-ray crystallography was a giant leap forward in our understanding of chromatin structure; however, it is becoming increasingly evident that the crystal structure does not represent the only functionally important conformation. Molecular dynamics simulations of nucleosomes allow us to go beyond crystallography by providing a framework to analyze interactions across a dynamic conformational ensemble, to understand the behavior of histone tails, and to study variations in histone sequences and their dynamical behavior. Our goal is to understand the influence of histone sequence variation on the structure and dynamics of nucleosomes and to identify the key determinants that affect nucleosome function. In our approach, we integrate molecular modeling and bioinformatics sequence analysis with experimental data in order to bridge the gap between these methods.
As a result, we identified and characterized rearrangements in nucleosomes on a microsecond timescale, including the coupling between the conformation of the histone tails and the DNA geometry. We found that certain histone tail conformations promoted DNA bulging near its entry/exit sites, resulting in the formation of twist defects within the DN

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIALM090313-04
Application #: 9352652
Support Year: 4
Fiscal Year: 2016

Institution

Name: National Library of Medicine

Related projects


NIH 2018 ZIA LM	In silico Modeling and Validation to Identify Molecular Mechanisms of Cancer Panchenko, Anna / National Library of Medicine
NIH 2017 ZIA LM	In silico Modeling and Validation to Identify Molecular Mechanisms of Cancer Panchenko, Anna / National Library of Medicine
NIH 2016 ZIA LM	Annotating effects of disease mutations on proteins and protein interactions Panchenko, Anna / National Library of Medicine
NIH 2015 ZIA LM	Annotating effects of disease mutations on proteins and protein interactions Panchenko, Anna / National Library of Medicine
NIH 2014 ZIA LM	Protein-protein binding: regulation and effect of disease mutations Panchenko, Anna / National Library of Medicine
NIH 2013 ZIA LM	Protein-protein binding: regulation and effect of disease mutations Panchenko, Anna / National Library of Medicine	$1,115,664

Publications

Zhao, Feiyang; Zheng, Lei; Goncearenco, Alexander et al. (2018) Computational Approaches to Prioritize Cancer Driver Missense Mutations. Int J Mol Sci 19:

NCBI Resource Coordinators (2018) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 46:D8-D13

Rogozin, Igor B; Goncearenco, Alexander; Lada, Artem G et al. (2018) DNA polymerase η mutational signatures are found in a variety of different types of cancer. Cell Cycle 17:348-355

Goncearenco, Alexander; Rager, Stephanie L; Li, Minghui et al. (2017) Exploring background mutational processes to decipher cancer genetic heterogeneity. Nucleic Acids Res :

Xiao, Hua; Wang, Feng; Wisniewski, Jan et al. (2017) Molecular basis of CENP-C association with the CENP-A nucleosome at yeast centromeres. Genes Dev 31:1958-1972

Goncearenco, Alexander; Li, Minghui; Simonetti, Franco L et al. (2017) Exploring Protein-Protein Interactions as Drug Targets for Anti-cancer Therapy with In Silico Workflows. Methods Mol Biol 1647:221-236

El Kennani, Sara; Adrait, Annie; Shaytan, Alexey K et al. (2017) MS_HistoneDB, a manually curated resource for proteomic analysis of human and mouse histones. Epigenetics Chromatin 10:2

Li, Minghui; Goncearenco, Alexander; Panchenko, Anna R (2017) Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols. Methods Mol Biol 1550:235-260

Rogozin, Igor B; Pavlov, Youri I; Goncearenco, Alexander et al. (2017) Mutational signatures and mutable motifs in cancer genomes. Brief Bioinform :

Shaytan, Alexey K; Xiao, Hua; Armeev, Grigoriy A et al. (2017) Hydroxyl-radical footprinting combined with molecular modeling identifies unique features of DNA conformation and nucleosome positioning. Nucleic Acids Res 45:9229-9243

Showing the most recent 10 out of 33 publications
