Only three biological examples of massive protein sequence variation are known to exist. The most extensively characterized example occurs in the immune system of jawed vertebrates (i.e., antibodies and T-cell receptors, whose immunoglobulin fold accommodates ~10^14-10^16 possible sequences), and the second, less characterized example occurs in the immune system of jawless vertebrates (leucine-rich repeat-fold proteins with ~10^14 sequences). The third example was recently discovered in a class of prokaryotic retroelements, termed diversity-generating retroelements (DGRs), found in diverse bacterial species, including the human pathogen Bordetella and members of the human periodontal and intestinal microbiota. The central feature of DGRs is the production of massive protein sequence variation through a unique adenine-specific, template-based mechanism. The most extensively characterized DGR-encoded variable protein is Mtd (~10^13 sequences), which functions as the receptor-binding protein of Bordetella bacteriophage; variation in Mtd enables host tropism switching by the phage. To understand how Mtd accommodates massive sequence variation, we determined the structures of a number of Mtd variants and characterized the interaction of Mtd with one of its prevalent Bordetella receptors, pertactin. The structures revealed that Mtd uses a C-type lectin fold as a novel scaffold for massive sequence variation, and that the receptor-binding site of Mtd is remarkably stable to sequence variation. Our long-term objective is to understand how DGR-encoded variable proteins accommodate massive sequence variation and bind diverse targets. We specifically aim to use the power of phage genetics combined with detailed insights from biochemistry and structural biology to determine the basis for (1) receptor recognition by Mtd and (2) accommodation of massive sequence variation by other DGR-encoded variable proteins.
The justification for these studies comes from the fact that massive protein sequence variation is extremely rare in biology. An understanding of DGR-encoded variable proteins will likely provide basic insight into modes of macromolecular recognition, a fundamental process in all biological systems, as well as place limits on the biological function of DGR-encoded variable proteins. The results from our studies may also have practical applications. Antibodies are nearly unparalleled in the natural world in their ability to vary in sequence and bind almost any target antigen. Recently, bacteria were found to encode proteins that vary in sequence at a scale comparable to antibodies. Our project is aimed at providing basic knowledge on how these variable bacterial proteins recognize diverse targets, with the expectation that this knowledge will have practical applications.
Le Coq, Johanne; Ghosh, Partho (2011) Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement. Proc Natl Acad Sci U S A 108:14649-53
Dai, Wei; Hodes, Asher; Hui, Wong H et al. (2010) Three-dimensional structure of tropism-switching Bordetella bacteriophage. Proc Natl Acad Sci U S A 107:4347-52
Miller, Jason L; Le Coq, Johanne; Hodes, Asher et al. (2008) Selective ligand recognition by a diversity-generating retroelement variable protein. PLoS Biol 6:e131