Dramatic advances in our understanding of molecular structure and function promise to accelerate the creation of new diagnostics and therapeutics. However, the link between the structure of a biological macromolecule and its function is usually not obvious: understanding how a molecule functions requires understanding how its structure behaves over time. Recent advances in molecular dynamics simulations now allow the rapid collection of information about structural motion. The resulting data sets are huge and require statistical machine learning algorithms to characterize and recognize patterns relevant to function. The National Library of Medicine's new long-range plan calls for research in the use of advanced simulation and machine learning algorithms in support of biomedical research. This proposal focuses on annotating molecular structures with missing or incomplete functional information. We are particularly interested in identifying binding sites and active sites in proteins. We bring together simulation and machine learning, and hypothesize that the performance of structure-based function annotation methods will improve dramatically with the addition of information about dynamics. Thus, our specific aims are (1) to develop methods for recognizing function from structural dynamics and diversity, (2) to develop large-scale clustering and analysis tools for the discovery of novel functions, and (3) to apply our tools to challenging and important biological systems, while disseminating our software, data and capabilities to the biomedical research community. In particular, we will focus our new capabilities on three difficult function annotation challenges: ATP binding sites, phosphorylation sites, and metabolizing enzyme active sites.

Public Health Relevance

The explosion in data related to molecular biology has created great opportunities for new disease diagnostics and therapies. One source of data is the three-dimensional (3D) structure of biological molecules such as proteins, DNA and RNA. This work focuses on using computational technologies to understand how these structures perform their function, so we have a better understanding of both normal and disease processes.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Stanford University
Schools of Medicine
United States