When genomes are sequenced for personalized medicine, each patient can have up to 10,000 variations in their protein sequences. To identify which amino acid changes are medically relevant, many computer algorithms have been developed. In thinking about how to improve these algorithms, we considered that, among their input, many include evolutionary information about the affected proteins. Algorithms also include "rules" devised from decades of mutation experiments: Similar amino acids allow function (toggle on); other amino acids abolish function or structure (toggle off); each mutation will have the same outcome in any homolog. However, experiments have been very heavily biased to conserved positions. In contrast, >50% of amino acid positions are not conserved during the evolution of most proteins. If nonconserved positions follow different rules, this may be one source for false positive and negative predictions in genome analyses. We are bridging this gap between experimental protein chemistry and computer predictions. In our first study, we used 10 homologs to assess the outcomes for >1000 mutations at nonconserved positions. Strikingly, these positions did not follow any of the substitution rules listed above. First, when multiple amino acids were substituted into one position, they caused a wide range of functional outcomes ("rheostat position"). Second, chemically similar amino acids did not always have similar outcomes. Third, when a given position was substituted in multiple homologs, the same amino acid had different outcomes. Thus, rheostatic nonconserved positions are likely to give false results in current predictions. Preliminary results show that other proteins have rheostat positions. The central hypothesis of this proposal is that rheostat positions have general properties that distinguish them from other nonconserved positions.
In Aim 1, we will test the hypothesis that rheostat positions can be detected by a particular pattern of evolutionary change, using pyruvate kinase, aldolase, and an organic anion transmembrane transporter as model systems. If prediction is possible, amino acid variants at rheostat positions should be, for now, classified as having "unknown significance" to reduce false predictions. Further, all experimental results can be used by the CAGI community to assess the development of new algorithms.
In Aim 2, we will use molecular dynamics simulations and hydrogen exchange experiments to determine how rheostat mutations affect protein motions.
In Aim 3, we will use X-ray crystallography and structural predictions to determine how rheostat mutations affect side-chain packing. The results from Aims 2-3 (i) can be used to identify regions in other proteins that contain rheostat positions, and (ii) will provide the groundwork for formulating new rules for predicting the outcomes of rheostat mutations. The new rules are needed to reach our long-term goals of improving computer predictions and reducing the number of clinical variants with unknown significance.
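
The contrast between toggle and rheostat positions described above can be illustrated with a minimal sketch. It assumes that functional outcomes for each substitution at one position have been normalized so that 0.0 means function is abolished and 1.0 means wild-type function. The function name, the bin count, and both cutoffs are hypothetical choices made for illustration; this is not the RheoScale method or any published scoring scheme.

```python
from typing import Sequence


def classify_position(
    outcomes: Sequence[float],
    n_bins: int = 10,             # hypothetical histogram resolution
    toggle_cutoff: float = 0.7,   # hypothetical: share of variants at the extremes
    rheostat_cutoff: float = 0.5  # hypothetical: share of bins that are occupied
) -> str:
    """Label one position from the normalized outcomes of its substitutions.

    Assumes each outcome is scaled so 0.0 = no function and 1.0 = wild type.
    """
    if not outcomes:
        raise ValueError("at least one substitution outcome is required")

    # Clamp to [0, 1] and assign each variant to a histogram bin.
    clamped = [min(max(x, 0.0), 1.0) for x in outcomes]
    bins = [min(int(x * n_bins), n_bins - 1) for x in clamped]

    # Toggle-like: most variants are either fully "off" or fully "on".
    at_extremes = sum(1 for b in bins if b in (0, n_bins - 1))
    extreme_fraction = at_extremes / len(bins)

    # Rheostat-like: variants spread across much of the functional range.
    occupied_fraction = len(set(bins)) / n_bins

    if extreme_fraction >= toggle_cutoff:
        return "toggle"
    if occupied_fraction >= rheostat_cutoff:
        return "rheostat"
    return "other"


if __name__ == "__main__":
    # Made-up outcome values, for illustration only.
    print(classify_position([0.0, 0.02, 0.01, 0.05, 0.95, 1.0]))
    print(classify_position([0.05, 0.15, 0.3, 0.45, 0.55, 0.7, 0.85, 0.95]))
```

In this sketch, a position whose variants cluster at the two extremes is labeled a toggle, while one whose variants occupy many bins across the range is labeled a rheostat. The input values shown are invented and do not come from the experiments described here.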

Public Health Relevance

Personalized medicine: To identify medically relevant changes in patient proteins, current tools make assumptions that are based, in part, on how each protein evolves. However, these assumptions do not apply to >50% of amino acid positions. Our experimental results will lead to improved "rules" for new computer analyses.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Macromolecular Structure and Function B Study Section (MSFB)
Program Officer: Mcguirl, Michele
University of Kansas
Schools of Medicine
Kansas City
United States
Schmit, Jeremy D; Kariyawasam, Nilusha L; Needham, Vince et al. (2018) SLTCAP: A Simple Method for Calculating the Number of Ions Needed for MD Simulation. J Chem Theory Comput 14:1823-1827
Hodges, Abby M; Fenton, Aron W; Dougherty, Larissa L et al. (2018) RheoScale: A tool to aggregate and quantify experimentally determined substitution outcomes for multiple variants at individual protein positions. Hum Mutat 39:1814-1826