The idea of storing information in deoxyribonucleic acid (DNA) first arose a few decades ago. DNA is intriguing as a storage medium because of its high density, long half-life of over 100 years, and low maintenance costs. Society's growing demand for large volumes of permanent data storage has renewed interest in DNA as a storage medium. However, DNA storage also poses many new challenges. Unlike electronic media, which physically order their data, DNA strands are stored in a very small space without any order. This project seeks to understand how DNA strands can be designed using a thermodynamically driven approach.
The project will develop a set of computational and physical tools that enable truly extreme-scale DNA storage systems. A key goal is a set of generalizable design principles for DNA-based storage systems, garnered through a deep, fundamental understanding, derived from a statistical thermodynamic framework, of how DNA sequences translate into physical interactions. Building on this framework, the project will engineer specific data manipulations that make random access and search more efficient, and will create effective overall architectures for DNA storage systems. The project's tasks involve exploring, modeling, and testing key thermodynamic knobs for tuning strand interactions.
If the project is successful, it may help lay the foundation for practical and affordable DNA-based storage. In turn, long-term, reliable information storage may become more widely available. The project will also train multiple Ph.D. and undergraduate students in related areas of synthetic biology and computer systems engineering. Their training will include interacting with the project leaders, taking coursework relevant to the project, carrying out research tasks, writing and critiquing articles for publication, and presenting their work. The project will also broaden participation in computing through outreach to underserved communities in the areas around North Carolina State University.
The results of this project will be made available through a public website at http://go.ncsu.edu/dna-storage. The website and data collected in the project will be available for at least five years after the end of the project and will remain available until such provision is no longer practical.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.