In one group of studies we have been examining the most highly conserved parts of proteins, which comprise a small fraction of the total structure, typically 6-20 residues. At the same time we are investigating how various molecular properties, such as packing, correlate with the sequence-conserved regions. Understanding the structural basis of sequence conservation would enable us to make critical predictions of protein cores from sequences alone. We have studied several proteins, most recently lysozyme and α-lactalbumin, in which we find a set of residues that connects all of the secondary structure elements and could act as a critical nucleus. In addition we are developing several new approaches to threading that take cores and sequence conservation into account in a comprehensive way. We have also been studying protein folding intermediates by fluorescence, measuring intramolecular distances and comparing them with the corresponding distances in the native structure. In the cases studied to date, the distances in the molten globule intermediates are similar to the native state distances.

Another goal of computational biology is to understand molecular mechanisms. Conventional molecular dynamics simulations of protein structures have not been very informative about large-scale motions. We are investigating protein dynamics with a new coarse-grained model that has only one point per residue. This approach offers a simple way to infer functional behavior from structures: it considers fluctuations about known protein structures using a Gaussian network model. The procedure has been shown to sample the distribution of residue fluctuations around the native conformation satisfactorily and to yield excellent agreement with crystallographic temperature factors and hydrogen exchange data for a broad variety of protein and nucleic acid structures. Because the method is simple, the results are intuitive and compelling.
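To make the one-point-per-residue idea concrete, the following is a minimal sketch of a Gaussian network model calculation, not the implementation used in these studies. It assumes a uniform spring between every pair of points closer than a cutoff; the 7.0 Å default, the function names, and the toy coordinates are illustrative assumptions. The diagonal of the pseudo-inverse of the connectivity (Kirchhoff) matrix is proportional to the mean-square fluctuation of each residue, which is the quantity usually compared with crystallographic temperature factors.

```python
def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity matrix for one point per residue (cutoff in angstroms, assumed)."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 <= cutoff ** 2:
                # Off-diagonal -1 for each contact; diagonal counts the contacts.
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: is the contact network connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def relative_fluctuations(coords, cutoff=7.0):
    """Diagonal of the Kirchhoff pseudo-inverse, proportional to <dR_i^2>."""
    n = len(coords)
    gamma = kirchhoff_matrix(coords, cutoff)
    # For a connected network the only zero mode is the uniform vector, so
    # pinv(G) = inv(G + J/n) - J/n, where J is the all-ones matrix.
    shifted = [[g + 1.0 / n for g in row] for row in gamma]
    inverse = invert(shifted)
    return [inverse[i][i] - 1.0 / n for i in range(n)]


# Toy example: three points in a line, 3.8 angstroms apart (hypothetical data).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
fluctuations = relative_fluctuations(chain)
# The two chain ends come out more mobile than the middle point.
```

For real structures one would read the alpha-carbon coordinates from a structure file and use a linear algebra library; the pure-Python inversion here only keeps the sketch self-contained.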
The approach yields a series of modes of motion, typically hinge-bending motions, including the slowest, most global motions. In a recent methodological development we have extended the calculations from scalar to vector form, so that we can now follow translational deformations. This opens new and exciting prospects for understanding the full functional dynamics of extremely large, even supramolecular, structures. In another computational improvement, the time required for the calculations has been reduced by several orders of magnitude. Our recent studies with this approach have included: 1) reverse transcriptase, in which we showed how the anti-correlations between the motions of the fingers/thumb binding site and the ribonuclease H site could lead to a stepwise processing mechanism in which the nucleic acid chain progresses through the enzyme in a release-pull-turn series of motions; 2) tRNA, free and bound to its cognate synthetase (both show similar motions, independently and together); 3) the GroEL-GroES protein chaperone system, an extremely large system (8000 residues), to show how the cavity is compressed in many different ways and how the available interior binding surface changes through these motions; and 4) tubulin, where dimerization is critical for enhancing the cooperativity of motions, and the dimer motions include a wobble between the subunits, elongation and compression along the long axis of the dimer, and twisting of the two monomers in opposite directions. Recent applications have included tubulin in its fibrillar form, as well as several additional nucleic acid binding proteins. Anticipated applications include studies of binding and conformational transitions for a broad variety of proteins. We can now extend these calculations to extremely large structures (1 million amino acids). Functional motions, e.g. processing, usually depend only on the overall shape and not on all the details of the structure.
Typically these motions are hinge bending between two domains, stretching of the whole structure, or rotations between subunits. The important finding is that some local loops over binding sites open and close as part of these large-scale motions, not independently. This has important implications for sequence conservation: the entire protein structure is critical for facilitating these apparently local motions.

New Fast Structure Determination. This approach combines computed structure libraries with surface distance measurements obtained from crosslinked structures whose links are identified by mass spectrometry.

Phosphorylation to Regulate Proteins. Cascade effects relate to changes in conformation. We are beginning to understand why phosphorylation causes a conformational change in some cases but not in others.

Drug Discovery. The same mathematical formalism used for calculating the motions of proteins (singular value decomposition) has also been applied to analyze the cell-line screening data. It was possible to cluster the 122 agents into 25 distinct groups, and to classify the cell lines themselves, in a highly systematic way. The 60 cell lines cluster into 21 groups, with the strongest groupings found for renal, leukemia and ovarian cancer.
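As a rough illustration of how singular value decomposition can group agents by their response profiles, the sketch below finds the leading singular triplet of a small agents-by-cell-lines matrix by power iteration and splits the agents by the sign of their coordinate along it. This is an assumption-laden toy, not the analysis described above: the data are invented, the function names are hypothetical, and a two-way sign split stands in for a full clustering over several singular vectors. In practice a library routine such as `numpy.linalg.svd` would normally replace the hand-written iteration.

```python
import math
import random


def leading_singular_triplet(matrix, iterations=500, seed=0):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value."""
    rows, cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    for _ in range(iterations):
        # w = A v, then z = A^T w; renormalizing z gives the next estimate of v.
        w = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        z = [sum(matrix[i][j] * w[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in z))
        if norm == 0.0:
            break
        v = [x / norm for x in z]
    w = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in w))
    u = [x / sigma for x in w] if sigma > 0.0 else w
    return sigma, u, v


def split_by_leading_component(matrix):
    """Split row indices (agents) by the sign of their leading left singular coordinate."""
    _, u, _ = leading_singular_triplet(matrix)
    positive = [i for i, x in enumerate(u) if x >= 0.0]
    negative = [i for i, x in enumerate(u) if x < 0.0]
    return positive, negative


# Hypothetical response matrix: 4 agents (rows) by 3 cell lines (columns).
responses = [
    [2.0, -1.0, -1.0],
    [1.8, -0.9, -0.9],
    [-1.0, 0.5, 0.5],
    [-2.0, 1.0, 1.0],
]
group_a, group_b = split_by_leading_component(responses)
# Agents 0 and 1 fall in one group and agents 2 and 3 in the other;
# which group is labeled "a" depends on the arbitrary sign of the singular vector.
```

The same decomposition applied to the transposed matrix gives coordinates for the cell lines rather than the agents, which is how both can be classified from one data set.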