Intellectual Merit: The rapid development of inexpensive, high-fidelity, and high-throughput sequencing technologies has facilitated the sequencing of countless genomes. Sequencing data have led to valuable insights into the descent of species and have played a key role in the annotation of genes in related organisms. However, elucidating the direct relationship between a gene and its precise function is still laborious. Similarly, although functional domains in proteins are often known, determining which amino acids are essential and which amino acid substitutions are allowed still requires an enormous time investment. This project leverages high-throughput sequencing technology to measure genome-wide utilization of genes at the nucleotide level. Bacterial cells will be grown under well-defined experimental conditions in the continual presence of a strong mutagen. This treatment will result in the accumulation of mutations in unutilized genes or in parts of utilized genes coding for replaceable amino acids. The introduced variation will be detected by DNA sequencing, and the results should distinguish utilized genes from unutilized ones, and replaceable amino acids from those required for protein function.

Broader Impacts: This platform will be widely applicable for both industrial and academic purposes, enabling protocol-driven gene discovery. The method requires little upfront knowledge of metabolism and uses biocomputing to assess gene function. The project will provide educational opportunities for a postdoctoral fellow and for undergraduate students.

Project Report

The rapid development of inexpensive, high-fidelity, and high-throughput sequencing technologies has facilitated the sequencing of countless genomes. Sequencing data have led to valuable insights into the descent of species and have played a key role in the annotation of genes in related organisms. However, elucidating the direct relationship between a gene and its precise function is still laborious. At a much smaller scale, the same is true for individual amino acids in proteins. Although functional domains in proteins are often known, determining which amino acids are essential and which amino acid substitutions are allowed still requires an enormous time investment. This project aimed to use high-throughput sequencing technology to measure genome-wide utilization of genes at the nucleotide level. Variation is introduced by constant application of the mutagen nitrite, and the resulting accumulation of mutations in unutilized genes, or in parts of utilized genes that code for replaceable amino acids, is detected by deep sequencing of the treated culture.

In preparation for the data anticipated from this project, we developed a statistical approach to measure gene essentiality. In many ways this project has changed the way my lab thinks about how to gather data for integrative biology. We have now started several pilot experiments that all use deep sequencing of DNA as a method to measure complex phenotypes.

A New Brunswick Bioflo 110 multi-vessel bioreactor setup was modified to facilitate the proposed experiment. The nutrient, pH, and mutagen reservoirs were fitted with steam-sterilizable quick connects to create a setup that could run for months at a time. To provide the constant environment required for this experiment, the cell density had to be kept constant. In addition, the maximum genetic variation per cell per unit time is achieved by constant exposure to mutagen at a concentration that causes slightly less than one lethal mutation per cell division (an almost zero growth rate). Software was developed to control the Bioflo bioreactor, to measure cell density through oxygen consumption measurements, and to control auxiliary hardware that removes liquid from the reactor vessels and takes samples. This software automatically controlled the cell density by dilution with fresh media, and the net growth rate by adjusting the mutagen concentration in the reactor vessel. Bioflo bioreactors are among the most commonly used bioreactors, and we made our software freely available on GitHub for the benefit of researchers in academia and industry.

Several STEM-educated students were involved in biological experimentation as a result of this project. Hong Yang (chemical engineering) and David Burdge (mechanical engineering) were the two principal scientists working to realize the goals of the grant. Hong Yang has been closely involved in getting two undergraduates, Dominic Mandy and Eli Krumholz, up to speed with metabolic modeling. My lab provides an interdisciplinary environment that introduces the scientists of the future to the wide breadth of disciplines with which they need to be at ease.
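The feedback scheme described in this report (dilution with fresh media to hold cell density constant, and mutagen dosing to hold the net growth rate near zero) can be illustrated with a minimal Python sketch. This is not the software released by the project: the function names, the proportional control rule, the setpoints, the gain, and the units are all hypothetical, chosen only to show the shape of such a control loop.

```python
import math
from dataclasses import dataclass


@dataclass
class ControllerConfig:
    # All values below are illustrative assumptions, not settings from the project.
    target_density: float = 1.0       # density setpoint, arbitrary units
    target_growth_rate: float = 0.0   # net growth rate setpoint, per hour
    vessel_volume_ml: float = 1000.0  # working volume of one reactor vessel
    mutagen_gain: float = 0.5         # mM of mutagen added per (1/h) of growth-rate error
    max_mutagen_mm: float = 50.0      # upper bound on mutagen concentration


def estimate_growth_rate(density_before: float, density_after: float, hours: float) -> float:
    """Net exponential growth rate (per hour) between two density readings."""
    return math.log(density_after / density_before) / hours


def dilution_volume(density: float, cfg: ControllerConfig) -> float:
    """Volume (mL) to replace with fresh medium so the density returns to the setpoint."""
    if density <= cfg.target_density:
        return 0.0  # at or below the setpoint: no dilution needed
    return cfg.vessel_volume_ml * (1.0 - cfg.target_density / density)


def next_mutagen_concentration(current_mm: float, growth_rate: float,
                               cfg: ControllerConfig) -> float:
    """Proportional update: raise the mutagen dose when growth is above the setpoint,
    lower it when growth is below, and clamp to the allowed range."""
    error = growth_rate - cfg.target_growth_rate
    proposed = current_mm + cfg.mutagen_gain * error
    return min(max(proposed, 0.0), cfg.max_mutagen_mm)


if __name__ == "__main__":
    cfg = ControllerConfig()
    rate = estimate_growth_rate(1.0, 1.25, hours=1.0)
    print("growth rate (1/h):", rate)
    print("dilute (mL):", dilution_volume(1.25, cfg))
    print("next mutagen (mM):", next_mutagen_concentration(10.0, rate, cfg))
```

A real controller would also have to convert oxygen consumption readings into a density estimate and drive the pumps; both are omitted here because the report does not describe how they were implemented.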

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Type: Standard Grant (Standard)
Application #: 1042335
Program Officer: Gregory W. Warr
Budget Start: 2010-09-15
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $256,919
Name: University of Minnesota Twin Cities
City: Minneapolis
State: MN
Country: United States
Zip Code: 55455