Most symmetric proteins consist of a relatively small core unit that is repeated. They are structurally simple compared to proteins that are not symmetric, yet they appear capable of carrying out all types of functions: some are enzymes, others are carriers of proteins, still others are receptors, and so on. Symmetric proteins are therefore probably a good starting point for designing proteins de novo to perform a specific function. Because of their relative simplicity, they should also be good molecules with which to study sequence-structure-function relations. The evolutionary history of these proteins is also interesting. They probably arose by gene duplication and fusion. Although mutation rates will vary depending on how strongly the function requires symmetry, proteins whose repeats are highly similar in sequence presumably arose later than those in which the similarity is beginning to disappear. After sufficient time, the sequence similarity will disappear and the structural symmetry will also degrade. Symmetry should thus give an additional handle for following the evolution of these proteins. Interest in symmetric structures seems to be rising; there were only a few reports on symmetry detection prior to 2008, but at least four different groups, including ours, reported separate symmetry detection methods in the past three years. Our symmetry detection program, SymD (Kim et al., BMC Bioinformatics 11:303, 2010), is based on two algorithms that we developed earlier, SE (Seed Extension; Tai et al., BMC Bioinformatics 10 Suppl 1:S4, 2009) and RSE (Refinement with SE; Kim et al., BMC Bioinformatics 10:210, 2009). SE finds the optimal structure-based sequence alignment for a given structure superposition without using dynamic programming or a gap penalty.
RSE uses SE and the Kabsch algorithm to find the optimal structure superposition and structure-based sequence alignment, starting from an initial structure superposition or sequence alignment. SymD uses RSE to optimally align a protein structure with itself after circularly permuting the second copy by k residues, for every k from 1 to N-3, where N is the total number of residues in the protein. The SymD procedure is superior to other symmetric protein detection methods in several respects: (1) The procedure detects symmetry even when the structure contains symmetry-breaking insertions or deletions, either within or between the repeating units. (2) The procedure depends on and uses the symmetry of the molecule. It is a symmetry detector, not just a repeat detector. (3) The procedure is sensitive because it amplifies the symmetry signal. (4) The procedure yields the sequence alignment between repeating units and the position and orientation of the symmetry axis. (5) The procedure is capable of detecting more than one symmetry in a molecule. Using this program, we determined that approximately 20% of all distinct protein domains (SCOP 1.75 ASTRAL 40% domain dataset) may be considered globally symmetric. These include most of the well-known symmetric folds, including TIM barrels, alpha-alpha superhelices and toroids, beta-trefoils, beta-propellers, leucine-rich repeats, ferredoxins, etc. The symmetries observed are broadly of three types: slip, closed and open. Slip symmetric proteins look invariant after a translation by a few residues in one direction. As far as we know, we are the first to recognize this invariance and to consider it a type of symmetry (manuscript in preparation). These proteins are mostly helix bundles. In symmetric closed structures, the N- and C-termini of the molecule come close together and the two ends of the molecule are stitched together, often by a set of hydrogen bonds (the Velcro joining).
Most of these have 2- to 8-fold rotational symmetries, but the transmembrane beta-barrels can have higher symmetries and also screw symmetries. In symmetric open structures, the N- and C-termini are at opposite ends of the molecule. All of them have either a helical or a pure 2-fold rotational symmetry. A protein with a pure 2-fold rotational symmetry can have either a closed (intertwined) or an open structure. Current research effort is directed at (1) characterizing the small number of protein domains that have two or more symmetry elements, (2) perfecting the algorithm for automatic classification of observed symmetries, and (3) developing an algorithm for detecting locally symmetric sub-structures that are embedded in larger, globally non-symmetric structures. Future efforts will be directed at collecting repeating units and studying their structures and interactions.
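
The self-alignment scan described in this abstract (align the structure with a copy of itself that has been circularly permuted by k residues, for k from 1 to N-3) can be illustrated with a toy sketch. This is not the SymD implementation: instead of RSE and a Kabsch superposition, it substitutes a simple rotation-invariant check that counts how many inter-residue distances are preserved when residue i is matched to residue (i + k) mod N. The function names, the toy coordinates and the 0.5 tolerance are all assumptions made for illustration.

```python
import math

def make_symmetric_trace(unit, n_fold):
    """Build a toy closed structure by copying a repeat unit n_fold times
    around the z axis (hypothetical helper, not part of SymD)."""
    trace = []
    for r in range(n_fold):
        angle = 2.0 * math.pi * r / n_fold
        c, s = math.cos(angle), math.sin(angle)
        for x, y, z in unit:
            trace.append((c * x - s * y, s * x + c * y, z))
    return trace

def shift_score(coords, k, tol=0.5):
    """Fraction of residue pairs (i, j) whose distance is preserved, within
    tol, when each residue i is matched to residue (i + k) mod N.
    This is a stand-in for the RSE alignment score, not the real thing."""
    n = len(coords)
    kept = 0
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = math.dist(coords[i], coords[j])
            d_shift = math.dist(coords[(i + k) % n], coords[(j + k) % n])
            total += 1
            if abs(d_orig - d_shift) <= tol:
                kept += 1
    return kept / total

def scan_shifts(coords, tol=0.5):
    """Score every circular shift k from 1 to N-3, as in the text."""
    n = len(coords)
    return {k: shift_score(coords, k, tol) for k in range(1, n - 2)}

# Toy example: an irregular 5-residue unit repeated with 4-fold symmetry.
unit = [(10.0, 0.0, 0.0), (12.0, 2.0, 3.0), (9.0, 4.0, 6.0),
        (13.0, 6.0, 2.0), (8.0, 7.0, -1.0)]
trace = make_symmetric_trace(unit, n_fold=4)
scores = scan_shifts(trace)
best = max(scores, key=scores.get)
print(best, scores[best])
```

In this toy case the shifts that are multiples of the repeat length preserve every distance, which is the kind of signal the scan looks for. A distance-only check cannot recover the symmetry axis and cannot handle insertions or deletions, both of which the abstract attributes to the actual alignment-based procedure.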

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIABC011233-03
Application #: 8349419
Support Year: 3
Fiscal Year: 2011
Total Cost: $629,777
Name: National Cancer Institute Division of Basic Sciences
Mazor, Ronit; Tai, Chin-Hsien; Lee, Byungkook et al. (2015) Poor correlation between T-cell activation assays and HLA-DR binding prediction algorithms in an immunogenic fragment of Pseudomonas exotoxin A. J Immunol Methods 425:10-20
Taylor, Todd J; Bai, Hongjun; Tai, Chin-Hsien et al. (2014) Assessment of CASP10 contact-assisted predictions. Proteins 82 Suppl 2:84-97
Tai, Chin-Hsien; Bai, Hongjun; Taylor, Todd J et al. (2014) Assessment of template-free modeling in CASP10 and ROLL. Proteins 82 Suppl 2:57-83
Mazor, Ronit; Eberle, Jaime A; Hu, Xiaobo et al. (2014) Recombinant immunotoxin for cancer treatment with low immunogenicity by identification and silencing of human T-cell epitopes. Proc Natl Acad Sci U S A 111:8571-6
Pak, Youngshang; Pastan, Ira; Kreitman, Robert J et al. (2014) Effect of antigen shedding on targeted delivery of immunotoxins in solid tumors from a mathematical model. PLoS One 9:e110716
Taylor, Todd J; Tai, Chin-Hsien; Huang, Yuanpeng J et al. (2014) Definition and classification of evaluation units for CASP10. Proteins 82 Suppl 2:14-25
Tai, Chin-Hsien; Paul, Rohit; Dukka, K C et al. (2014) SymD webserver: a platform for detecting internally symmetric protein structures. Nucleic Acids Res 42:W296-300
Liu, Wenhai; Onda, Masanori; Kim, Changhoon et al. (2012) A recombinant immunotoxin engineered for increased stability by adding a disulfide bond has decreased immunogenicity. Protein Eng Des Sel 25:1-6
Samson, Franck; Shrager, Richard; Tai, Chin-Hsien et al. (2012) DOMIRE: a web server for identifying structural domains and their neighbors in proteins. Bioinformatics 28:1040-1
Feiglin, Ariel; Moult, John; Lee, Byungkook et al. (2012) Neighbor overlap is enriched in the yeast interaction network: analysis and implications. PLoS One 7:e39662

Showing the most recent 10 out of 17 publications