A gap exists in how to interpret the information in the enormous number of sequenced human exomes in terms of the functional consequences of the observed amino acid variations and their connection to human diseases. This gap also underlies the failure to develop drugs, without side effects, to treat these diseases. This failure is exacerbated by the fact that a given drug molecule binds to different proteins involved in numerous cellular processes. This proposal details how and why these problems occur and, in the context of protein structure, how our existing and proposed work can help surmount them. We first elucidate the design principles underlying protein structure and function and then apply them to repurpose FDA-approved drugs to treat Mendelian diseases and to identify the genetic variations underlying such diseases. We begin by examining whether the stereochemical space of small molecule drugs and endogenous metabolites is complete, and how the properties of drugs and metabolites differ. From these analyses, we will suggest how binding specificity might emerge from a highly promiscuous background. This might enable the design of better drugs with minimal side effects and a better understanding of how cells work. Employing these insights, we then develop better structure-based approaches to virtual ligand screening and enzyme function inference. The ability to predict enzymatic function is particularly essential, as residue mutations associated with loss of enzymatic function are the most important missense mutations associated with Mendelian disease. These approaches will use the conservation of ligand-protein microenvironments in stereochemically similar ligand binding sites or active sites in different proteins, regardless of their evolutionary relationship.
We will explore the biochemical consequences of a class of enzymes that we discovered, dizymes: single-domain proteins that perform two different enzymatic activities at two different active sites. For representative cases, we will experimentally test our predictions of ligand binding and enzymatic activity and their influence on cellular biochemical function. All developed tools will be combined in a comprehensive exome annotation approach. First, it will identify disease-associated residue variations. Then, it will predict the diseases a protein might be associated with and suggest the best protein targets. Finally, it will suggest which drugs might best treat the disease.
In principle, the information provided by the ~228,000 human exomes sequenced this year could provide tremendous insights into personalized disease diagnosis and treatment. However, this potential is often unrealized as many genetic variations in an exome are of unknown significance and small molecule therapies to redress the functional effects of the associated disease(s) are unknown. This project will develop the tools needed for a comprehensive approach to exome annotation that will suggest which variations might have significant disease association, what diseases they might cause, and what repurposed drugs might assist in the treatment of the identified disease(s).