The Chemical Abstracts Service (CAS) recently recorded the 53 millionth unique chemical substance in the CAS Registry, with the 40 millionth cataloged only 9 months earlier. This explosive growth raises the question of which physical, chemical, and biological properties these substances possess. Existing data for fundamental physical and chemical properties, such as dissociation energies, logP, enthalpies of formation, refractive indices, boiling points, and melting points, define Structure Property Relationships (SPRs) and are accessible through SciFinder and Beilstein. Similarly, fundamental biological properties, such as binding constants to enzymes, are available through the ChEMBL and PubChem databases.

This research creates an innovative tool, built on volunteer computing, for predicting SPRs and mapping them three-dimensionally for education and research, using a novel computational algorithm called "PROPMAP". The PROPMAP architecture combines a Monte Carlo random walk structure generator with Quantitative Structure Property Relationship (QSPR) models based on a graphics processing unit (GPU) accelerated Support Vector Regression algorithm. PROPMAP interactively maps user-selected physical, chemical, and biological properties onto any input chemical structure of interest. The graphical representation fosters understanding of SPRs in basic chemistry education and enables targeted property modifications in research. The approach uses volunteer GPU computing through the Berkeley Open Infrastructure for Network Computing (BOINC) to overcome the otherwise prohibitive computational expense of training and cross-validating models on data sets of more than one million substances. The impact of PROPMAP on the scientific community is broadened by making the tool freely available through a web interface for education and research, and by training institutions directly in its use through workshops and seminars.
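The idea of pairing a random walk structure generator with a property model can be illustrated with a toy sketch. This is not the PROPMAP algorithm: the fragment table, its numeric values, and the names `FRAGMENT_CONTRIBUTION`, `predict_property`, and `map_property` are all hypothetical, and a simple additive score stands in for a trained Support Vector Regression QSPR model. The walk mutates one substituent position at a time and records how strongly the predicted property responds at each position, which is one plausible way to "map" a property onto a structure.

```python
import random

# Hypothetical fragment contributions to some property.
# Illustrative numbers only, not measured or fitted data.
FRAGMENT_CONTRIBUTION = {"H": 0.0, "CH3": 0.5, "OH": -1.0, "Cl": 0.7, "NH2": -1.2}


def predict_property(substituents):
    """Placeholder for a trained QSPR model: sums fragment contributions."""
    return sum(FRAGMENT_CONTRIBUTION[s] for s in substituents)


def map_property(start, steps, seed=0):
    """Monte Carlo random walk over substituent positions.

    At each step one randomly chosen position receives a randomly chosen
    fragment. Returns, per position, the mean absolute change in the
    predicted property caused by mutations at that position.
    """
    rng = random.Random(seed)
    fragments = sorted(FRAGMENT_CONTRIBUTION)
    current = list(start)
    totals = [0.0] * len(current)
    counts = [0] * len(current)
    for _ in range(steps):
        pos = rng.randrange(len(current))
        before = predict_property(current)
        current[pos] = rng.choice(fragments)
        after = predict_property(current)
        totals[pos] += abs(after - before)
        counts[pos] += 1
    # Positions never visited get a sensitivity of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

Calling `map_property(["H", "H", "H"], steps=200)` returns one sensitivity value per position. In a real setting the placeholder scorer would be replaced by a model trained and cross-validated on measured property data.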

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Application #: 1122919
Program Officer: Alan Sussman
Project Start:
Project End:
Budget Start: 2011-08-01
Budget End: 2014-07-31
Support Year:
Fiscal Year: 2011
Total Cost: $240,000
Indirect Cost:
Name: Lowe Edward W
Department:
Type:
DUNS #:
City: Nashville
State: TN
Country: United States
Zip Code: 37232