The long-term research goal of my laboratory is to develop cost-effective computational methods for making functional predictions from DNA sequences and for exploring patterns of genomic evolution (comparative genomics). In this proposal we plan to develop a statistical framework for modeling changes in evolutionary rate and substitution pattern over a given period of evolutionary time (rate-pattern change). Our objective is to provide a statistically sound methodology for identifying the amino acid residues responsible for functional divergence among homologous genes of a gene family. The methodology rests on an auxiliary principle (working hypothesis): that there is an intrinsic connection between evolutionary rate-pattern change and structural/functional change in a protein family. These newly developed methods may have great potential for biomedical and pharmaceutical research in the era of functional genomics.
Our specific aims are to (1) develop a novel stochastic model for gene family evolution, with special reference to evolutionary rate-pattern change after gene duplication, and implement an efficient maximum likelihood algorithm for sequence analysis; (2) develop a hidden Markov model (HMM) framework for predicting amino acid residues that have experienced rate-pattern changes among homologous genes of a gene family; (3) extend these methods so that they can handle complicated cases such as very large gene families; and (4) test the relationship between sequence diversity and functional/structural diversity (the auxiliary principle) using protein families for which relevant biological information is available, e.g., by checking whether residues whose importance for functional divergence has been verified by site-directed mutagenesis consistently receive high HMM scores. Our methods may have many applications, from understanding the 3D structural basis of functional divergence and predicting critical residues to exploring the pattern of functional divergence among homologous genes at the genome level.
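To make aim (2) concrete, the following is a minimal sketch of how a two-state HMM could assign each alignment site a posterior probability of having experienced a rate-pattern change. It is an illustration under stated assumptions, not the model proposed here: the two states ("unchanged" and "changed"), the symmetric transition probability `stay`, the prior `prior_changed`, and the function name `posterior_changed` are all hypothetical, and the per-site likelihoods in `emissions` are assumed to come from some separate phylogenetic likelihood calculation that is not shown.

```python
from typing import List, Sequence


def posterior_changed(
    emissions: Sequence[Sequence[float]],
    stay: float = 0.9,
    prior_changed: float = 0.5,
) -> List[float]:
    """Posterior P(site is in the 'changed' state | all sites), two-state HMM.

    emissions[i][s] is the (assumed, externally computed) likelihood of
    alignment column i under state s, with s = 0 for 'unchanged' and
    s = 1 for 'changed'. At least one likelihood per site must be positive.
    """
    n = len(emissions)
    if n == 0:
        return []

    # Hypothetical symmetric transition matrix between neighbouring sites.
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
    init = [1.0 - prior_changed, prior_changed]
    states = (0, 1)

    # Forward pass, rescaled at every site to avoid numerical underflow.
    fwd: List[List[float]] = []
    for i in range(n):
        if i == 0:
            a = [init[s] * emissions[0][s] for s in states]
        else:
            prev = fwd[-1]
            a = [
                emissions[i][s] * sum(prev[r] * trans[r][s] for r in states)
                for s in states
            ]
        z = sum(a)
        fwd.append([x / z for x in a])

    # Backward pass, also rescaled at every site.
    bwd: List[List[float]] = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        b = [
            sum(trans[s][r] * emissions[i + 1][r] * bwd[i + 1][r] for r in states)
            for s in states
        ]
        z = sum(b)
        bwd[i] = [x / z for x in b]

    # Combine and renormalise per site to obtain posterior probabilities.
    post: List[float] = []
    for i in range(n):
        w = [fwd[i][s] * bwd[i][s] for s in states]
        post.append(w[1] / (w[0] + w[1]))
    return post
```

In such a sketch, sites with a high posterior would be the candidates to compare against experimentally verified residues, as in aim (4). A real model would likely need transition and emission parameters estimated from the data rather than fixed by hand.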