The recent explosion in worldwide data, together with the end of Moore's Law and the approaching limits of silicon-based data storage, is driving an urgent need for alternative computing and data storage/retrieval platforms. In particular, exabyte-scale datasets are increasingly being generated by the biological sciences and engineering disciplines, including genomics, transcriptomics, proteomics, metabolomics, and high-resolution imaging, as well as by other scientific fields such as climate science, ecology, astronomy, oceanography, sociology, and meteorology. As these datasets continue to grow, storing, processing, and harnessing them requires a concomitant increase in computational power, which in turn calls for new substrates for, and forms of, computing and data storage. Unlike traditional data storage and computing materials such as silicon, the human brain offers a remarkable ability to sense, store, retrieve, and compute information in a manner that is unrivaled by any human-made material. In this research project, analogous modes of information sensing, data storage, retrieval, and computation will be explored in molecular systems and materials not traditionally used for computing. The overarching goal of the research is to discover revolutionary new modes of data storage/retrieval, sensing, and computation that rival conventional silicon-based technology, for deployment to benefit society broadly across all domains of data science. Graduate students and postdocs across five institutions will be trained and mentored in a highly interdisciplinary manner to attain this goal and to prepare the next generation of data scientists, chemists, physicists, and engineers to harness the ongoing data revolution. The research will be disseminated to a broad community through news outlets and through high school student internships in participating research laboratories.

Large-scale datasets from spatial-temporal calcium imaging of the mouse brain will be recorded into DNA-based, nanoparticle-based, and phononic 2D and 3D soft and hard materials. Continuous spatial-temporal data will first be transformed into discrete data for mapping onto DNA-conjugated fluorophore networks, dynamic barcoded nanoparticle networks, and phononic 2D and 3D materials. Sensing, computation, and data storage/retrieval will be demonstrated as proofs of principle by exploiting the chemical properties of molecular networks and materials to recover the encoded neuronal datasets and their sensing and computing processes. Success with any of these three prototypical materials would revolutionize the ability to encode arbitrarily complex, large-scale datasets into complex molecular systems, with the potential to scale across diverse data domains and materials frameworks. The investigators' Autonomous Computing Materials framework will thereby enable the encoding of arbitrary "big data" sets into diverse materials for data storage, sensing, and computing. This project maximizes opportunities for disruptive new computing and data science concepts to emerge from a multidisciplinary, collaborative team spanning data science, neuroscience, materials science, chemistry, physics, and biological engineering.

This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity and is jointly supported by HDR and the Division of Chemistry within the NSF Directorate for Mathematical and Physical Sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

National Science Foundation (NSF)
Division of Advanced CyberInfrastructure (ACI)
Program Officer: Daryl Hess
University of Colorado at Boulder
United States