The University of California, Riverside is awarded a grant to develop ChemMineTools, an environment for efficiently analyzing and modeling large sets of small molecules along with their bioactivity data. It will provide unrestricted access to a scalable set of open source tools that integrate novel and existing algorithms. To maximize its utility for experimental and computational scientists, the analysis modules will be available from the powerful R environment as well as through an intuitive web interface. The specific objectives of this project are: (1) The development and implementation of accelerated compound search and clustering algorithms that scale to today's large databases with millions of entries. This will focus on the expansion of the ultrafast EI-Search and EI-Clustering algorithms, which are based on embedding and indexing (EI). These multipurpose algorithms will be adapted to advanced similarity measures that currently cannot be used for processing large databases because they are too slow. (2) The R package ChemMine R Tools will be developed. It will offer access to advanced clustering, machine learning and visualization functionalities along with interactive visualization tools. (3) User-friendly access to all analysis and visualization tools will be provided by the ChemMine Web Tools interface. (4) An educational outreach program will be offered to provide extensive training opportunities and to integrate underrepresented groups into this project.

By integrating novel and existing analysis routines in an efficient data mining environment, the project will disseminate and transform multidisciplinary concepts of powerful chemical approaches in modern biology. Moreover, it will encourage young scientists to incorporate cheminformatics strategies into their daily research. Substantial educational resources for interdisciplinary training at the intersection of computational biology and cheminformatics will be provided by this project. Workshops will be offered to scientists, postdoctoral researchers, graduate and undergraduate students, and high school and other K-12 students. Members of underrepresented groups will participate in all aspects of this project. Extensive online workshop and software manuals will be provided to maximize the educational outreach of the activities. Further information about this project may be found at its website: http://cmtools.ucr.edu.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0957099
Program Officer: Peter H. McCartney
Project Start:
Project End:
Budget Start: 2010-05-15
Budget End: 2014-04-30
Support Year:
Fiscal Year: 2009
Total Cost: $601,032
Indirect Cost:

Name: University of California Riverside
Department:
Type:
DUNS #:
City: Riverside
State: CA
Country: United States
Zip Code: 92521