This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. In the synthetic biology community, bacteria are re-engineered to carry out tasks that benefit human society, including the production of biofuels, therapeutics, plastics, and other important chemicals. With the latest advances in genetic engineering techniques, including DNA synthesis, the challenge is to identify the DNA sequence that reliably yields a desired behavior without resorting to trial-and-error procedures. Specifically, it is important to precisely control the production rate of a protein in order to design a biological system that exhibits a desired metabolic or regulatory behavior. To solve this problem, we have developed a design algorithm that generates a DNA sequence that causes a protein to be produced at a user-specified rate. A user inputs an arbitrary protein coding sequence and a target production rate, and the algorithm outputs the RNA (DNA) sequence that yields the target rate. More specifically, the algorithm predicts the translation initiation rate of a protein coding sequence from a messenger RNA transcript by modeling the interaction between the ribosome binding site on the mRNA and the ribosome. Using the predictive model, a Monte Carlo optimization technique identifies the RNA (DNA) sequence that meets the user's target translation rate with the given protein coding sequence. These predictions and the design process have been experimentally validated in the bacterium Escherichia coli (a common industrial organism). The design algorithm predicts translation rates across four orders of magnitude, and its accuracy is ~1 kT in terms of the Gibbs free energy.
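The search described above, a Monte Carlo optimization driven by a predictive model, can be illustrated with a minimal Python sketch. It is not the project's implementation: `toy_predict_rate` is a hypothetical placeholder that scores similarity to the Shine-Dalgarno consensus AGGAGG in place of the actual ribosome-binding model, and the function names, sequence length, step count, and temperature are all illustrative assumptions.

```python
import math
import random

BASES = "ACGU"
SD_CONSENSUS = "AGGAGG"  # assumption: toy stand-in for the real binding model


def toy_predict_rate(rbs):
    """Hypothetical placeholder predictor, NOT the abstract's model.

    Scores the best ungapped match to the Shine-Dalgarno consensus and maps
    0..6 matches onto four orders of magnitude (1 to 10^4, arbitrary units).
    """
    best = 0
    for i in range(len(rbs) - len(SD_CONSENSUS) + 1):
        window = rbs[i:i + len(SD_CONSENSUS)]
        best = max(best, sum(a == b for a, b in zip(window, SD_CONSENSUS)))
    return 10.0 ** (4.0 * best / len(SD_CONSENSUS))


def design_rbs(target_rate, length=20, steps=5000, temperature=0.5,
               seed=0, predict=toy_predict_rate):
    """Metropolis Monte Carlo search for a sequence whose predicted rate
    is close to target_rate (distance measured on a log scale)."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]

    def cost(s):
        return abs(math.log(predict("".join(s)) / target_rate))

    current_cost = cost(seq)
    best_seq, best_cost = list(seq), current_cost
    for _ in range(steps):
        # Propose a single-base mutation at a random position.
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(BASES.replace(old, ""))
        new_cost = cost(seq)
        delta = new_cost - current_cost
        # Accept improvements always, and worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_cost = new_cost
            if new_cost < best_cost:
                best_seq, best_cost = list(seq), new_cost
        else:
            seq[pos] = old  # reject: undo the mutation
    designed = "".join(best_seq)
    return designed, predict(designed)


if __name__ == "__main__":
    sequence, rate = design_rbs(target_rate=1e4)
    print(sequence, rate)
```

Any callable that maps a sequence to a positive rate can be passed as `predict`, so a thermodynamic model could replace the toy scorer without changing the search loop.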
The design algorithm is written in C and Python. We have created a web-accessible version of the design algorithm at http://voigtlab.ucsf.edu/software/. The computational cost of the design algorithm is moderate; a single design job can require between 5 and 30 minutes on one core of a 1.6 GHz dual-core Intel processor with 2 GB of RAM. The synthetic biology community (at least 1000 PIs/postdocs/graduate students) would make frequent use of this tool. Accordingly, we request 30,000 SUs of TeraGrid Roaming resources to perform the computations. Any combination of suitable resources would be appreciated, such as the Purdue Condor Pool, the IU Quarry IBM HS21 blade server, or the TACC Ranger. A single user may request 10-100 or more design jobs annually, which corresponds to ~50 SU/user/year and supports up to ~600 users/year. From his PhD studies, Howard Salis has extensive experience with developing algorithms and running jobs on the TeraGrid, including parallel programming with MPI.
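The arithmetic behind the request can be checked with a short Python sketch. It assumes that 1 SU equals one core-hour, which the text does not state, and it uses the upper end of the quoted ranges (100 jobs per user at 30 minutes each); the function names are illustrative.

```python
def su_per_user_per_year(jobs_per_user, minutes_per_job):
    """Core-hours one user consumes per year (assumption: 1 SU = 1 core-hour)."""
    return jobs_per_user * minutes_per_job / 60.0


def supported_users(total_su, su_per_user):
    """Number of users a given allocation can serve for one year."""
    return total_su / su_per_user


if __name__ == "__main__":
    per_user = su_per_user_per_year(jobs_per_user=100, minutes_per_job=30)
    print(per_user)                          # SUs per user per year
    print(supported_users(30000, per_user))  # users served by 30,000 SUs
```

Under these assumptions, 100 jobs at 30 minutes each come to 50 SU per user per year, and 30,000 SUs cover 600 such users, matching the figures in the request.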