RNA nanostructure design has received unprecedented attention due to the growing number of emerging applications in fields such as diagnostics, therapeutics, synthetic biology, biological materials, and molecular programming. However, designing and synthesizing long RNA molecules with improved stability, programmable geometries, and controllable functions remains extremely challenging. Large RNAs are difficult to design because of their long sequences and the complex interactions between their bases. In addition, once a structure is designed, validating it experimentally is time-consuming and expensive. A platform with effective algorithms and tools for efficient, accurate RNA design would therefore be invaluable. This project will advance national health, prosperity, and welfare by providing the knowledge required to design and synthesize long RNAs with desired functionalities and improved stability. These RNAs could have an important impact in applications such as drug delivery and cancer therapy. The team will develop new computational methods that support the discovery of next-generation nanostructures in a more efficient and informed manner. The project will also improve the understanding of the fundamental rules that characterize the folding of large-scale RNA sequences. The project will involve algorithm development along with experimental activities. As a result, educational material will be developed across science and engineering programs, bringing a diverse group of students together.
Most existing RNA-design algorithms focus on conserved, naturally evolved 3D RNA motifs. These algorithms employ the idea of a "block", a group of roughly 10 nucleotides (nts), and investigate the possible bindings among nucleotide pairs within and between blocks (the block-driven approach). Current approaches suffer from low accuracy when predicting the folding of large RNA molecules (>200 nts). This is a critical issue because there is a compelling need to generate longer sequences to fully exploit RNA functionalities such as catalysis, gene regulation, and the organization of proteins in large machineries, as well as their use in the material and biomedical sciences. Several challenges make designing large-scale RNA structures hard. For example, RNA compounds can be stable even when they are not in minimum free-energy configurations. Experiments have also shown that alternative configurations can exist for the same RNA sequence, with different associated levels of minimum free energy. It is therefore necessary to develop approaches, experimental as well as computational, that are agnostic to an explicit reward function, e.g., by embedding data-driven information to determine the likelihood that an RNA configuration exists. This multidisciplinary project will tackle two main challenges in the design of RNA structures. (i) Perform optimization without explicit knowledge of a reward function. The concept of optimization driven by empirically developed experts will be developed in this project. (ii) Scale the methods in (i) to high-dimensional cases. A bio-inspired concept of a tile is employed to create a computationally efficient algorithmic framework to generate and explore tiles, which will be evaluated using expert-driven rollout over chains of tiles. The resulting algorithms will subsequently be validated using existing RNA databases.
New RNA building blocks will be proposed and constructed through the algorithmic framework, which will be validated as an aid to the design of single-stranded RNA origami structures of increasing size and complexity that could potentially rival natural RNA machineries or designer DNA nanostructures.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.