This INSPIRE award is partially funded by the Information Integration and Informatics Program in the Division of Information and Intelligent Systems in the Directorate for Computer and Information Science and Engineering and the Solid State and Materials Chemistry Program in the Division of Materials Research and the Office of Multidisciplinary Activities in the Directorate for Mathematical and Physical Sciences.
The past two decades have seen rapid development of high-throughput experimentation (HTE) methodologies that would be extremely valuable for (i) the discovery of new applied materials with high complexity and (ii) the generation of deep understanding of structure/function, structure/activity and structure/performance relationships. High photon flux X-ray techniques in particular have enormous transformative potential in materials discovery. The research team leverages the data being collected at the Cornell High Energy Synchrotron Source (CHESS) and at Caltech's Joint Center for Artificial Photosynthesis (JCAP). While high-throughput inorganic library synthesis is relatively well established, high-throughput structure determination, which is at the heart of the proposed research, is in its infancy. X-ray diffraction is well suited for rapidly collecting information on the atomic arrangements in an inorganic sample, but the data do not immediately reveal a crystal structure. The development of data analysis, data mining and interpretation methodologies has not kept pace with the development of experimental capability. Consequently, data acquired in a week can take researchers many months to analyze by traditional means. Automation and machine-intelligent processing of the data are absolutely necessary to maximize the impact of complex multidimensional datasets.
This project addresses this state of affairs head-on. It investigates computational techniques for dealing with the multiparameter space associated with HTE structure determination of materials libraries, through constraint-guided search and optimization, statistical machine learning, and inference techniques, in combination with direct human input into the process. Anticipated advances include new probabilistic methods and computational discovery tools that integrate soft and hard constraints, which capture the complex background knowledge from the underlying physics and chemistry of materials, with insights gained from high-throughput data analytics and machine learning. If the project succeeds in achieving the anticipated enormous efficiency gains in complex structure determination, it could have a transformative impact on materials discovery and complex solid-state chemistry and physics.
The ability to reduce complex materials discovery and optimization from timeframes of months or years to hours or days could lead to a paradigm shift in the development of products benefiting society, with technological advances as well as commercial impact on energy, sustainability, health and quality of life. The planned free dissemination of data sets and computational tools to the larger scientific community is likely to enhance the broader impacts of the project. The project facilitates increased interdisciplinary interactions between computer scientists and materials scientists at Cornell University and offers enhanced opportunities for training a new generation of researchers at the interface between the two disciplines.