Machine Learning in Chemistry and Biology

Jain, Ajay

Abstract

Machine learning has broad applicability in chemistry and biology. This research effort focuses on the empirical derivation of functions that predict aspects of molecular interaction between proteins and ligands. The problem poses distinctive challenges for machine learning, chief among them that the configuration in which molecules interact is generally not known. For interactions between small molecules and proteins, where molecules can be represented as 3D objects, this uncertainty appears as hidden variables: the relative conformation and alignment of protein and ligand. Most machine learning tasks do not embed hidden variables in this fashion, but the problem is not insurmountable. We have implemented a number of methods demonstrating that the hidden-variable problem is tractable, both methodologically, in model induction and scoring function optimization, and computationally, in the complexity of search. In this work, we will develop novel methods and refine existing ones in three problem areas: 1) developing scoring functions for small molecule-protein interactions with a known protein structure (the docking problem); 2) developing quantitative models of small molecule activity against proteins with no known structure (the 3D QSAR problem); and 3) developing search and optimization methods that improve both the induction of models and scoring functions and their high-throughput application to large libraries of small molecules. The goal is to address the problem of prediction in a quantifiable way, which will both allow practical improvements in applications of the methods and provide insight into the mechanistic aspects of the underlying physical molecular interactions. All methods and data will be made widely available to both academic and industrial investigators.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 1R01GM070481-01A2
Application #: 6965574
Study Section: Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer: Wehrle, Janna P

Project Start: 2005-07-01
Project End: 2009-06-30
Budget Start: 2005-07-01
Budget End: 2006-06-30
Support Year: 1
Fiscal Year: 2005
Total Cost: $279,983
Indirect Cost

Institution

Name: University of California San Francisco
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 094878337

City: San Francisco
State: CA
Country: United States
Zip Code: 94143

Related projects


NIH 2013 R01 GM	Data-Driven Approaches for Molecular Docking	Jain, Ajay N. / University of California San Francisco	$295,918
NIH 2012 R01 GM	Data-Driven Approaches for Molecular Docking	Jain, Ajay N. / University of California San Francisco	$307,013
NIH 2011 R01 GM	Data-Driven Approaches for Molecular Docking	Jain, Ajay N. / University of California San Francisco	$307,363
NIH 2010 R01 GM	Data-Driven Approaches for Molecular Docking	Jain, Ajay N. / University of California San Francisco	$306,719
NIH 2009 R01 GM	Machine Learning in Chemistry and Biology	Jain, Ajay N. / University of California San Francisco	$235,562
NIH 2008 R01 GM	Machine Learning in Chemistry and Biology	Jain, Ajay N. / University of California San Francisco	$265,474
NIH 2007 R01 GM	Machine Learning in Chemistry and Biology	Jain, Ajay N. / University of California San Francisco	$265,474
NIH 2006 R01 GM	Machine Learning in Chemistry and Biology	Jain, Ajay N. / University of California San Francisco	$273,403
NIH 2005 R01 GM	Machine Learning in Chemistry and Biology	Jain, Ajay N. / University of California San Francisco	$279,983

Publications

Cleves, Ann E; Jain, Ajay N (2015) Chemical and protein structural basis for biological crosstalk between PPARα and COX enzymes. J Comput Aided Mol Des 29:101-12

Cleves, Ann E; Jain, Ajay N (2015) Knowledge-guided docking: accurate prospective prediction of bound configurations of novel ligands using Surflex-Dock. J Comput Aided Mol Des 29:485-509

Yera, Emmanuel R; Cleves, Ann E; Jain, Ajay N (2014) Prediction of off-target drug effects through data fusion. Pac Symp Biocomput :160-71

Spitzer, Russell; Cleves, Ann E; Varela, Rocco et al. (2014) Protein function annotation by local binding site surface similarity. Proteins 82:679-94

Varela, Rocco; Cleves, Ann E; Spitzer, Russell et al. (2013) A structure-guided approach for protein pocket modeling and affinity prediction. J Comput Aided Mol Des 27:917-34

Spitzer, Russell; Jain, Ajay N (2012) Surflex-Dock: Docking benchmarks and real-world application. J Comput Aided Mol Des 26:687-99

Varela, Rocco; Walters, W Patrick; Goldman, Brian B et al. (2012) Iterative refinement of a binding pocket model: active computational steering of lead optimization. J Med Chem 55:8926-42

Jain, Ajay N; Cleves, Ann E (2012) Does your model weigh the same as a duck? J Comput Aided Mol Des 26:57-67

Yera, Emmanuel R; Cleves, Ann E; Jain, Ajay N (2011) Chemical structural novelty: on-targets and off-targets. J Med Chem 54:6771-85

Spitzer, Russell; Cleves, Ann E; Jain, Ajay N (2011) Surface-based protein binding pocket similarity. Proteins 79:2746-63

Showing the most recent 10 out of 25 publications
