The rapid proliferation of cloud computing and the emergence of big data for massive scientific, financial, governmental, and genetic records are creating an information storage crisis. These data, once generated, cascade through the information storage lifecycle, from primary storage media in the form of hard disks and solid-state drives to archival media such as magnetic tape. While innovations in information density, stability, and energy consumption routinely occur, existing memory materials are approaching their physical and economic finish lines. As imagined by the Semiconductor Synthetic Biology (SemiSynBio) Roadmap, DNA-based massive information storage is a brand new start for memory manufacturing. As a result, the research team proposes to pioneer a cold storage paradigm by designing, building, and testing two accessible, editable, and non-volatile memory technologies made from DNA. Inspired by DNA circuits and made possible by state-of-the-art optical physics, the team will: (1) biologically synthesize DNA molecules, (2) engineer substrates made from said molecules, (3) write digital information onto the substrates using additional DNA molecules, (4) minimize encoding and decoding errors using computer science algorithms, and (5) read and edit the digital information on the substrates using reversible DNA binding. In full support of this interdisciplinary project, the research team includes expertise in DNA nanotechnology, nanoscale characterization, optical physics, biologically inspired algorithms, and synthetic biology. Modeled after the faculty collaboration, a new cadre of students will work and study at the confluence of the biological, computational, and engineering sciences in anticipation of the emerging field called Nucleic Acid Memory (NAM). As active participants in and co-owners of a Vertically Integrated Project called NAM, undergraduate and graduate students will enroll in a multi-year and multi-disciplinary research team that provides ongoing course and teaching credit.

The focal points of this proposal are two storage medium prototypes, digital Nucleic Acid Memory (dNAM) and sequence Nucleic Acid Memory (seqNAM). Each offers a novel approach to coding information using DNA, and both use super-resolution microscopy to read information. In dNAM, information is encoded into defined spatial arrangements of DNA sequences on top of addressable DNA origami nanostructures, called NAM storage nodes. DNA origami provides a convenient pathway and a proven approach to high-yield and rapid prototyping of NAM node structures. Staple strands will be extended from the NAM node structures with a unique sequence for site-specific attachment of NAM data strands. When bound, data strands serve as docking sites for complementary data imager strands, which are employed in a DNA-based form of super-resolution microscopy (SRM) called DNA-PAINT. DNA-PAINT is a stochastic super-resolution imaging technique that uses repetitive, transient binding of fluorescently labeled data imager strands to circumvent the diffraction limit of light. Thus, data imager strands act as the read head and reveal the state of each site of the NAM storage node with better than 7 nm resolution. Binary states at each data cell can be defined by the presence (1) or absence (0) of the NAM data strand, as determined by SRM. Increasing the site-specific bit density from 1 to 3 bits can be achieved simply by using multiple orthogonal sequences. Editing of data strands is performed by either adding a required data strand to a vacant data cell or by removing an existing data strand via toehold-mediated strand displacement. Built upon a similar storage node platform, seqNAM employs two data cells to arrange data strands into ordered arrays. In seqNAM, information is encoded within portions of the data strands that remain single stranded. The sequences of the data strands are read using a multi-color super-resolution sequencing (SRS) process that uses a library of locked nucleic acid imager strands. Editing is performed by removing the target data strands with complementary sequences using toehold-mediated strand invasion and then adding the replacement data strands. seqNAM goes beyond dNAM by storing information within DNA sequences at a potentially higher density. In addition, it creates a new enzyme-free sequencing platform.
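
The presence/absence encoding described for dNAM can be illustrated with a minimal Python sketch. It is not the project's encoding scheme: the strand labels "A", "B", and "C", the function names, the bit ordering, and the zero padding are all assumptions made for illustration. Each site is modeled as the set of orthogonal data strands bound there, so three orthogonal sequences give 3 bits per site, and an edit is the addition or removal of one strand at one site.

```python
from typing import FrozenSet, List

# Hypothetical labels for three orthogonal data-strand sequences;
# the source does not name them.
ORTHOGONAL_STRANDS = ("A", "B", "C")
BITS_PER_SITE = len(ORTHOGONAL_STRANDS)


def encode_sites(payload: bytes) -> List[FrozenSet[str]]:
    """Map bytes to per-site sets of bound data strands (bound = 1, absent = 0)."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    # Assumption: pad with zeros so the bit count is a multiple of BITS_PER_SITE.
    bits += [0] * (-len(bits) % BITS_PER_SITE)
    sites = []
    for start in range(0, len(bits), BITS_PER_SITE):
        chunk = bits[start:start + BITS_PER_SITE]
        sites.append(frozenset(s for s, bit in zip(ORTHOGONAL_STRANDS, chunk) if bit))
    return sites


def decode_sites(sites: List[FrozenSet[str]], n_bytes: int) -> bytes:
    """Recover n_bytes from the strands observed at each site."""
    bits = [1 if s in site else 0 for site in sites for s in ORTHOGONAL_STRANDS]
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for bit in bits[8 * i:8 * i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def edit_site(sites: List[FrozenSet[str]], index: int, strand: str,
              present: bool) -> List[FrozenSet[str]]:
    """Return a copy of sites with one strand added to or removed from one site."""
    updated = list(sites)
    if present:
        updated[index] = updated[index] | {strand}
    else:
        updated[index] = updated[index] - {strand}
    return updated


if __name__ == "__main__":
    stored = encode_sites(b"NAM")
    print(len(stored), "sites hold", decode_sites(stored, 3))
```

In this sketch, 24 bits fit exactly into 8 three-bit sites. Real readout would also need to handle missing or misread strands, which is where the error-minimizing algorithms mentioned in the abstract would come in.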

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Electrical, Communications and Cyber Systems (ECCS)
Application #: 1807809
Program Officer: Usha Varshney
Project Start:
Project End:
Budget Start: 2018-07-15
Budget End: 2021-06-30
Support Year:
Fiscal Year: 2018
Total Cost: $1,125,000
Indirect Cost:
Name: Boise State University
Department:
Type:
DUNS #:
City: Boise
State: ID
Country: United States
Zip Code: 83725