A grant has been awarded to Stony Brook University to unify, generalize, and expand algorithmic design techniques for synthetic biology, which have proven useful in pilot studies, into a coherent set of tools to address important experimental problems in microbiology. The project will result in algorithmic and software tools to couple imaginative sequence design with DNA synthesis in several areas. In particular, these tools will improve the robustness, efficiency, and generality of a new approach to identify the locations of critical DNA or RNA sequence signals. This work couples large-scale synthesis with sophisticated designs employing combinatorial group testing and balanced Gray codes. Software tools will be deployed to enable broad dissemination of the technology and improve a new technique exploiting the synthesis of carefully designed sequences (employing ideas from combinatorics, namely de Bruijn sequences) to titrate transcription factors on a genome-wide scale. The project will also address algorithmic research necessary to optimize this class of synthetic sequences, coupled with experimental work to evaluate the efficacy of these designs. Array-based oligo synthesis technologies provide access to thousands of low-cost, custom-designed sequence variants. The algorithms developed for the large-scale design of diverse coding sequences will allow researchers to exploit array-based synthesis technologies and assay their performance. Finally, the advent of synthetic genomics means that laboratory strains can be "refactored", i.e., redesigned to make them easier to experimentally manipulate. The project will build on restriction-site placement algorithms to produce a web-accessible genome factorization tool.
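As an illustration of the combinatorial idea mentioned above, a de Bruijn sequence of order k over the DNA alphabet is a cyclic string in which every possible k-mer occurs exactly once. The sketch below uses the standard Lyndon-word construction; it is not the project's own software, and the function name `de_bruijn` is chosen here purely for illustration.

```python
def de_bruijn(alphabet: str, k: int) -> str:
    """Return a cyclic de Bruijn sequence of order k over the given alphabet.

    Illustrative sketch only (standard Lyndon-word construction);
    not the design code used in the project.
    """
    n = len(alphabet)
    a = [0] * (n * k)      # working buffer of symbol indices
    seq = []               # accumulated symbol indices of the result

    def db(t: int, p: int) -> None:
        if t > k:
            # Emit the current prefix only when its length divides k.
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


# Every DNA 3-mer appears exactly once when the sequence is read cyclically.
s = de_bruijn("ACGT", 3)
linear = s + s[:2]  # unroll the cycle so wrap-around 3-mers are visible
kmers = {linear[i:i + 3] for i in range(len(s))}
print(len(s), len(kmers))
```

Because the sequence is cyclic, a linear oligo that covers all k-mers needs the first k-1 symbols appended to the end, as done in the usage lines.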

This collaboration between computational and life sciences researchers advances both disciplines, through new algorithmic results in combinatorial algorithms and discrete optimization as well as fundamental discoveries regarding gene expression, transcription factor analysis, and sequence signal detection. The project will result in software and experimental tools to advance broad areas of molecular biology. Beyond the algorithmic contributions of this research, the investigators will develop laboratory materials of general interest. Educational outcomes include mentoring of undergraduate research students. Software and results of this project will be available from the website www.cs.sunysb.edu/~skiena/dna.

Project Report

Our project expanded algorithmic design techniques for synthetic biology into a powerful set of tools to address important experimental problems in microbiology. In particular, we sought to design genes that maximize protein expression (important in manufacturing drugs and other biotechnologies) or minimize protein expression (important in the design of vaccines). We developed new algorithms to design protein-coding genes with important properties that affect how well they work. Specifically, we showed how to design genes that optimize (1) the amount of secondary structure under constraints, and (2) the amount of reuse of tRNA molecules. Both of these criteria have significant impacts on gene expression and protein production. We have shown in laboratory experiments that our gene designs have the desired properties, pointing the way to a better understanding of how gene expression works. We also developed new techniques to interpret data from ribosome profiling, an exciting experimental technology that measures effects on gene translation. In particular, we demonstrated that frequently used coding symbols (codons) are translated faster than rare codons, resolving an important question in gene translation. Further, we have generalized this technique to study other translation-specific phenomena such as autocorrelation, secondary structure, and codon-pair bias. The broader impacts of this work include collaborations between computational and life sciences researchers that advanced both disciplines, through new algorithmic results in combinatorial algorithms and discrete optimization as well as fundamental discoveries regarding gene expression. Our project resulted in software and experimental tools to advance broad areas of molecular biology. Educational outcomes include the training of graduate students and the mentoring of undergraduate students.
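To make the idea of designing a gene for codon reuse concrete, the following is a minimal sketch under strong simplifying assumptions. It treats "tRNA reuse" as reuse of the identical codon at successive occurrences of the same amino acid, which ignores that several codons can share a tRNA. The partial codon table, the function names `encode` and `reuse_count`, and the example peptide are all hypothetical and are not taken from the project's algorithms.

```python
from itertools import cycle

# Partial standard genetic code, restricted to a few amino acids
# for illustration (assumption: this subset suffices for the example).
SYNONYMOUS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
    "L": ["CTG", "CTC", "CTT", "CTA", "TTA", "TTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}


def encode(protein: str, reuse: bool) -> list[str]:
    """Pick one codon per residue.

    reuse=True  : always the same codon for a given amino acid.
    reuse=False : rotate through the synonymous codons instead.
    """
    rotating = {aa: cycle(codons) for aa, codons in SYNONYMOUS.items()}
    if reuse:
        return [SYNONYMOUS[aa][0] for aa in protein]
    return [next(rotating[aa]) for aa in protein]


def reuse_count(protein: str, codons: list[str]) -> int:
    """Count residues whose codon matches the codon used at the
    previous occurrence of the same amino acid."""
    last_used: dict[str, str] = {}
    count = 0
    for aa, codon in zip(protein, codons):
        if last_used.get(aa) == codon:
            count += 1
        last_used[aa] = codon
    return count


peptide = "MKLLSKLF"  # hypothetical example peptide
high = encode(peptide, reuse=True)
low = encode(peptide, reuse=False)
print(reuse_count(peptide, high), reuse_count(peptide, low))
```

Both encodings translate to the same peptide; they differ only in how often a codon repeats, which is the kind of property a design algorithm would optimize alongside other constraints such as secondary structure.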

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 1060572
Program Officer: Anne Maglia
Project Start:
Project End:
Budget Start: 2011-04-15
Budget End: 2015-03-31
Support Year:
Fiscal Year: 2010
Total Cost: $497,915
Indirect Cost:
Name: State University New York Stony Brook
Department:
Type:
DUNS #:
City: Stony Brook
State: NY
Country: United States
Zip Code: 11794