We have devised a novel solution to two related problems in protein engineering. (1) Production costs for many proteins of industrial or pharmaceutical importance are prohibitive because the recombinant protein is expressed sub-optimally in heterologous hosts. (2) Most cDNA expression libraries contain so many non-expressible sequences that many desired sequences may occur at frequencies too low for available screening systems to detect. The second problem is particularly acute for high-throughput library screening, which will be essential to accelerate pharmaceutical target discovery and validation in the post-genomics era. In Phase I the method was used to improve the expression of the Green Fluorescent Protein (GFP) of Aequorea victoria by a factor of 30 over the wild-type protein. A mutagenized library of GFP was expressed in E. coli as fusions with chloramphenicol (cam) acetyltransferase (CAT) and was selected for increased cam resistance. In addition, a single mutation was identified that specifically improved the expression of GFP in fusions with other proteins by a factor of 56 over wild-type. In Phase II the method will be used to optimize bacterial expression of several proteins of industrial or pharmaceutical value. The method will also be used to enrich expressed sequence libraries for autonomously folding domains.
The primary goal of this work is to develop new methods for engineering production values and stability in proteins of industrial and pharmaceutical importance.