Predicting disease phenotypes from genotypes is a grand challenge in biology and personalized medicine. Our long-term goal is to address this challenge using a combination of computational and experimental approaches. Toward this goal, we have developed and deployed an evolutionary systems approach to map the complex relationships connecting sequence, structure, function, regulation, and disease in biomedically important protein superfamilies such as protein kinases. Our contributions include describing unique modes of allosteric regulation in various protein kinases, deciphering the structural basis of oncogenic activation in a subset of receptor tyrosine kinases, uncovering the regulation of pseudokinases, and developing new tools and resources for addressing data integration challenges in the signaling field. We propose to build on these studies to answer key questions arising from our ongoing work, such as: What are the functions of pseudokinases, the catalytically inert members of the kinome, and how can we use pseudokinases to better predict and characterize non-catalytic functions of kinases? What are the functions of conserved cysteine residues in regulatory sites of protein and small-molecule kinases, and are they post-translationally modified during redox signaling and the oxidative stress response, which are causally associated with age-related disorders? How can we enhance existing computational models for predicting genome-phenome relationships using structural information, and can machine learning on structurally enhanced knowledge graphs reveal new relationships between patient-derived mutations and disease phenotypes? We propose to answer these questions using a variety of approaches, including statistical mining of large sequence datasets, molecular dynamics simulations, machine learning, mass spectrometry, biochemical analysis, and in vivo assays.
Completion of this work is expected to reveal new allosteric sites for targeting the non-catalytic functions of pseudokinases and kinases in disease, and to significantly advance our understanding of kinase regulatory mechanisms in disease and normal states. Our work will create new tools and resources for knowledge graph mining and provide explainable models for inferring causal relationships linking genomes and phenomes, with potential applications in personalized medicine. Finally, the scope and impact of our work will be significantly broadened by participation in studies that extend the specialized tools and technological approaches we developed for kinases to other biomedically important gene families such as glycosyltransferases and sulfotransferases.
Many human diseases, including cancer, diabetes, and inflammatory disorders, are causally associated with abnormal protein kinase and glycosyltransferase functions. By characterizing the regulatory functions of these proteins in disease and normal states, and by developing new tools to predict disease phenotypes from genotypes, the proposed studies will accelerate the targeting of these proteins for drug discovery and personalized medicine.