The Exon Theory of Genes hypothesizes that ancient genomes had an intron-exon structure that allowed rapid diversification of protein structure and function through recombination of exons. Much research has investigated which modern exons correspond to structural units in proteins, and structural models have identified putative protein building blocks that are ubiquitous among non-homologous proteins. Even so, the principles that govern whether a polypeptide can be exchanged among different proteins remain unclear. We have developed an algorithm, called SCHEMA, to predict which elements, or schemata, of a protein can be swapped among homologous proteins without disrupting the folded structure. We propose a combination of biochemical and computational studies using SCHEMA and other novel algorithms, with the goal of elucidating the rules that govern non-disruptive recombination and the evolution of novel functions by recombination.
Our specific aims are to: 1) determine the SCHEMA-predicted threshold(s) of tolerable structural disruption upon homologous recombination for lactamases and cytochrome P450 monooxygenases; 2) develop novel algorithms for predicting efficient recombination fitness searches; 3) characterize the effectiveness of predicted search strategies through laboratory evolution of lactamases and cytochrome P450s; 4) optimize predictions of recombinant structural disruption; and 5) investigate whether non-homologous proteins can be recombined to generate folded proteins, using the algorithms to guide crossover locations. These studies should allow us to discover when homologous and non-homologous recombination conserves protein structure, and should expand our understanding of how evolution explores sequence, structural, and functional diversity. Furthermore, these studies should generate new tools for protein engineering by laboratory evolution, with biomedical applications in the development of new biomaterials, biosensors, catalysts, and protein-based therapeutics.
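
The abstract does not spell out how a SCHEMA-style prediction scores a recombinant protein. As an illustration only, the sketch below assumes the contact-counting formulation commonly associated with SCHEMA: a residue-residue contact counts as disrupted when the chimera pairs two amino acids that do not occur together at those positions in any parent. The function name, the block assignment, the contact list, and the toy sequences are all hypothetical and are not taken from this project.

```python
def schema_disruption(parents, contacts, block_of, chimera_blocks):
    """Count disrupted contacts in a chimera (illustrative sketch only).

    parents        -- aligned parent sequences of equal length
    contacts       -- (i, j) index pairs of residues assumed to be in contact
    block_of       -- block_of[i] is the block index of position i
    chimera_blocks -- chimera_blocks[b] is the parent index supplying block b
    """
    # Assemble the chimera: each position inherits its residue from the
    # parent assigned to the block that contains it.
    chimera = [parents[chimera_blocks[block_of[i]]][i]
               for i in range(len(block_of))]

    disrupted = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        # A contact is retained if at least one parent has the same pair.
        if not any((p[i], p[j]) == pair for p in parents):
            disrupted += 1
    return disrupted


# Toy data (hypothetical): two 4-residue parents split into two blocks.
parents = ["ACDE", "AGDK"]
contacts = [(0, 3), (1, 2), (1, 3)]
block_of = [0, 0, 1, 1]

e_chimera = schema_disruption(parents, contacts, block_of, (0, 1))
e_parent = schema_disruption(parents, contacts, block_of, (0, 0))
```

In this toy case the chimera takes block 0 from the first parent and block 1 from the second, and only the contact between positions 1 and 3 pairs residues not seen together in either parent. A chimera built entirely from one parent has no disrupted contacts by construction. Under this assumed formulation, the threshold of tolerable disruption named in aim 1 would be a cutoff on this count.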

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Molecular and Cellular Biophysics Study Section (BBCA)
Program Officer
Wehrle, Janna P
California Institute of Technology
Schools of Engineering
United States
Romero, Philip A; Shapiro, Mikhail G; Arnold, Frances H et al. (2013) Directed evolution of protein-based neurotransmitter sensors for MRI. Methods Mol Biol 995:193-205
Heinzelman, Pete; Romero, Philip A; Arnold, Frances H (2013) Efficient sampling of SCHEMA chimera families to identify useful sequence elements. Methods Enzymol 523:351-68
Romero, Philip A; Arnold, Frances H (2012) Random field model reveals structure of the protein recombinational landscape. PLoS Comput Biol 8:e1002713
Romero, Philip A; Stone, Everett; Lamb, Candice et al. (2012) SCHEMA-designed variants of human Arginase I and II reveal sequence elements important to stability and catalysis. ACS Synth Biol 1:221-8
Jung, Sang Taek; Lauchli, Ryan; Arnold, Frances H (2011) Cytochrome P450: taming a wild type enzyme. Curr Opin Biotechnol 22:809-17
Lewis, Jared C; Coelho, Pedro S; Arnold, Frances H (2011) Enzymatic functionalization of carbon-hydrogen bonds. Chem Soc Rev 40:2003-21
Lewis, Jared C; Mantovani, Simone M; Fu, Yu et al. (2010) Combinatorial alanine substitution enables rapid optimization of cytochrome P450BM3 for selective hydroxylation of large substrates. Chembiochem 11:2502-5
Jackel, Christian; Bloom, Jesse D; Kast, Peter et al. (2010) Consensus protein design without phylogenetic bias. J Mol Biol 399:541-6
Acar, Murat; Pando, Bernardo F; Arnold, Frances H et al. (2010) A general mechanism for network-dosage compensation in gene circuits. Science 329:1656-60
Shapiro, Mikhail G; Westmeyer, Gil G; Romero, Philip A et al. (2010) Directed evolution of a magnetic resonance imaging contrast agent for noninvasive imaging of dopamine. Nat Biotechnol 28:264-70

Showing the most recent 10 out of 24 publications