With support from the Division of Chemistry (CHE) and the Division of Materials Research (DMR) in the Directorate for Mathematical and Physical Sciences (MPS) and the Directorate for Computer & Information Science & Engineering (CISE), this project aims to advance the use of modern data science in chemistry. In particular, the project will advance the field of data-driven chemical research by promoting the use of machine learning and other data mining techniques in the molecular sciences and by fostering and uniting a community of stakeholders. The work of the project and the community it represents aims to transform chemistry's ability to tackle challenging discovery and design problems. This approach can dramatically accelerate and streamline the process that leads to chemical innovation, an important factor in economic and technological advancement, and thus result in an improved return on public and private investments. The project also addresses the corresponding questions of training and workforce development in chemistry, thus ensuring the international competitiveness of the US.

The mission of this project is to assert the role of big data research in the chemical domain, i.e., to promote, enable, and advance the ideas of data-driven discovery and rational design. The project aims to create a community-driven roadmap and to facilitate concrete solutions that are beyond the scope of the disjointed efforts of its individual stakeholders. The Big Data Hubs and Spokes ecosystem is the ideal framework for realizing this vision and accelerating progress in this high-priority area of research. The effort at hand sets out to implement some of the key findings of the recent NSF Division of Chemistry workshop on Framing the Role of Big Data and Modern Data Science in Chemistry. The four signature initiatives of this Spoke project are (i) the planning, coordination, integration, and consolidation of community-developed software tools for big data research in chemistry, as well as the formulation of guidelines, best practices, and standards; (ii) the organization of workshops to build community, to connect solution seekers with solution providers, and to address questions ranging from strategic to technical; (iii) the creation and dissemination of community-developed teaching materials, as well as the formulation of course, program, and curricular recommendations for education and workforce development that reflect the changing, data-centric approach in chemical research; and (iv) the provision of access to a shared hardware infrastructure for community data sets, on-site data mining capacity, and the exploration of domain-specific method and hardware issues.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2018-09-01
Budget End: 2021-08-31
Fiscal Year: 2017
Total Cost: $700,000
Name: SUNY at Buffalo
City: Buffalo
State: NY
Country: United States
Zip Code: 14228