DNA-based data storage is an emerging recording paradigm that has received significant attention from the scientific community due to several recent demonstrations of the viability of storing information in macro-molecules. Unlike classical optical and magnetic storage technologies, DNA-based storage platforms offer extremely high recording densities, and they do not require an electrical supply to maintain data integrity. Furthermore, under mild maintenance conditions, DNA retains its information content for centuries while still allowing users to retrieve the information independently of the specific reading technology. Despite the promise of DNA-based archival systems, several problems remain that prevent wide-scale deployment of the technology. These include the high cost of DNA synthesis, the lack of structural and distributed organization of data encoded in DNA, and the absence of an integrated random access and readout mechanism. To address these issues, this collaborative project aims to test and implement a new molecular storage paradigm that combines unique ideas in polymer chemistry, coding theory and molecular dynamics modeling, as well as new nano-material and solid-state nano-pore technologies. The accompanying interdisciplinary research and educational programs involve experts in chemistry, biophysics, electrical engineering and theoretical computer science and aim to train a new cadre of students able to address future scientific challenges in molecular storage and computing systems.

The technical goal of the proposed program is to reduce the cost-integration barrier between classical recorders and DNA-based data storage devices by developing a new system centered around chimeric DNA, comprising cheap native DNA and chemically modified nucleotides. Chemically modified nucleotides extend the coding alphabet from four symbols to more than twenty. Chimeric DNA is stored and accessed using a novel implementation of self-rolled semiconductor micro-tubular grids, controlled by three-dimensional arrays of electrodes. Random access in such systems is achieved via voltage modulation, with selected DNA guided into a sample preparation and specialized nano-pore sequencing device. The implementation of such systems is aided by new software tools for molecular dynamics simulations. Additional system support is provided via new coding methods that combat the effects of chimeric DNA integration and nano-pore sensing errors. Particular research challenges include identifying chemical modifications in nucleotides amenable to detection via nano-pore sequencers, calculating electrostatic forces within the tubes and within the pores in the presence of chimeric DNA, and integrating the micro-tubular chip with an on-chip sample preparation and sensing device. Supporting work on bioinformatics and coding-theoretic algorithmic development is expected to ensure additional robustness and operational stability of the proposed system.
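
To illustrate why a larger coding alphabet matters, the following minimal sketch compares the ideal information content per symbol for a four-letter native DNA alphabet and an extended alphabet. The alphabet size of 20 is an assumption, chosen only as the lower end of "more than twenty"; the function names are hypothetical, and the calculation ignores synthesis constraints, sequencing errors, and any coding redundancy, so it is an upper bound rather than an achievable rate.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Ideal (error-free, unconstrained) information per symbol, in bits."""
    if alphabet_size < 2:
        raise ValueError("alphabet must contain at least two symbols")
    return math.log2(alphabet_size)


def symbols_needed(num_bits: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size**n >= 2**num_bits."""
    n = 0
    capacity = 1  # number of distinct messages representable with n symbols
    while capacity < 2 ** num_bits:
        capacity *= alphabet_size
        n += 1
    return n


# Native DNA: four nucleotides (A, C, G, T).
native = bits_per_symbol(4)

# Extended alphabet: 20 symbols is an illustrative assumption, not a project figure.
extended = bits_per_symbol(20)

print(f"native:   {native:.2f} bits/symbol")
print(f"extended: {extended:.2f} bits/symbol")
print(f"symbols per byte: {symbols_needed(8, 4)} (native) vs {symbols_needed(8, 20)} (extended)")
```

Under these idealized assumptions, one byte needs four native nucleotides but only two symbols from a 20-letter alphabet, which suggests how alphabet expansion could shorten the strands that must be synthesized.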

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1807526
Program Officer: Mitra Basu
Project Start:
Project End:
Budget Start: 2018-10-01
Budget End: 2021-09-30
Support Year:
Fiscal Year: 2018
Total Cost: $875,000
Indirect Cost:
Name: University of Illinois Urbana-Champaign
Department:
Type:
DUNS #:
City: Champaign
State: IL
Country: United States
Zip Code: 61820