In a multiple sequence alignment (MSA) of evolutionarily diverse protein sequences, functionally important amino acid residues appear as conserved sites. Other, less conserved sites are also known to influence the native conformation and function of proteins, and are therefore functionally important as well. It is hypothesized that when such a site mutates, compensatory mutations follow elsewhere in the protein so that structure or function is preserved. However, such co-evolving sites are not straightforward to identify. We have applied information theory to the human GPCR proteome to identify non-conserved sites that co-evolve. We found that for each class of GPCR (i.e., A, B, and C), a cohort of amino acid pairs has high mutual information. From these pairs, we constructed a mutual information graph in which the nodes represent residue positions and the edges are weighted by the mutual information between positions. We extracted a statistically significant cohort of positions from the graph and found that in classes A and C they reside in the ligand binding cavity. However, for class B, for which no binding cavity has been observed, we did not find these positions in a cavity. We are now examining the nucleotide sequences to determine what kind of evolutionary pressure (purifying, adaptive, or neutral) these key positions may have been under. We have since applied our methods to other molecular families.
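The mutual information graph described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this project: the function names, the toy alignment, and the fixed threshold are hypothetical, and the sketch omits gap handling, finite-sample corrections, and the statistical significance test applied to the real data.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    count_i = Counter(col_i)              # residue counts in column i
    count_j = Counter(col_j)              # residue counts in column j
    count_ij = Counter(zip(col_i, col_j))  # joint residue-pair counts
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_i[a] * count_j[b]))
    return mi


def mi_graph(alignment, threshold):
    """Return {(i, j): MI} for position pairs whose MI exceeds threshold.

    Nodes are column indices; each dictionary entry is a weighted edge.
    The threshold is an illustrative stand-in for a significance test.
    """
    columns = list(zip(*alignment))  # transpose sequences into columns
    edges = {}
    for i, j in combinations(range(len(columns)), 2):
        mi = mutual_information(columns[i], columns[j])
        if mi > threshold:
            edges[(i, j)] = mi
    return edges


# Hypothetical toy alignment: column 1 is fully conserved, columns 0 and 3
# vary together, and column 2 varies independently of the others.
toy_msa = ["ACDA", "ACEA", "GCDG", "GCEG"]
edges = mi_graph(toy_msa, threshold=0.5)
print(edges)
```

In this toy case only the pair of positions that vary together receives an edge. The fully conserved column carries no mutual information with any other column, which is why the approach targets non-conserved sites.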
Fatakia, Sarosh N; Costanzi, Stefano; Chow, Carson C (2011) Molecular evolution of the transmembrane domains of G protein-coupled receptors. PLoS One 6:e27813
Milac, Adina; Anishkin, Andriy; Fatakia, Sarosh N et al. (2011) Structural models of TREK channels and their gating mechanism. Channels (Austin) 5:23-33
Fatakia, Sarosh N; Costanzi, Stefano; Chow, Carson C (2009) Computing highly correlated positions using mutual information and graph theory for G protein-coupled receptors. PLoS One 4:e4681