Advances in sequencing technologies provide rapidly increasing amounts of data on human genetic variation. However, distinguishing neutral variants (with little or no effect on phenotype) from variants conferring disease risk remains a major challenge for both monogenic (Mendelian) and complex diseases. The current state-of-the-art methods for diagnosing amino acid variants primarily employ evolutionary information obtained from multispecies sequence analysis in a variety of ways. While these methods have been used extensively, they often fail to correctly diagnose damaging variants at evolutionarily variable positions and neutral variants at highly conserved positions. Our initial investigations suggest that protein structural dynamics, which are crucial for proper biochemical activity, have the potential to improve prediction of function-altering variants at less conserved positions and neutral variants at highly conserved positions. Therefore, we propose to explore and build novel in silico prediction tools that exclusively use parameters capturing protein structure and dynamics. We propose to investigate the use of various structural dynamics features that capture the multidimensional effects of perturbations on a residue when the protein structure is displaced out of equilibrium. We will also independently assess the diagnostic power of different structural dynamics features in a systematic, quantitative way and compare the accuracy of our models with state-of-the-art methods. Furthermore, we will explore the use of multiple methods together to identify the most reliable diagnoses. Success of this project will catalyze research at the interface of protein structural biology, molecular genetics, evolution, and medicine, as it will advance the mechanistic understanding of protein function disruption in functional and genomic investigations.
Affordable sequencing technologies are quickly revealing nonsynonymous single nucleotide variants (nsSNVs) in personal exomes, many of which have the potential to disrupt protein function and modulate individual phenotypes. We plan to explore the development of more accurate predictive computational methods by integrating protein structural dynamics with functional biological knowledge. This will lead to better diagnosis and mechanistic understanding of the structural features of sequence variations implicated in human health.