The long-term research goal of my laboratory is to develop cost-effective computational methods for making functional predictions from DNA sequences and for exploring patterns of genomic evolution (comparative genomics). In this proposal we plan to develop a statistical framework for modeling changes in evolutionary rate and substitution pattern over a given period of evolutionary time (rate-pattern change). Our objective is to provide a statistically sound methodology for identifying the amino acid residues responsible for functional divergence among homologous genes of a gene family, based on an auxiliary principle (working hypothesis) that evolutionary rate-pattern change is intrinsically connected to structural/functional change in a protein family. These newly developed methods may have great potential for biomedical and pharmaceutical research in the era of functional genomics.
Our specific aims are to: (1) develop a novel stochastic model for gene family evolution, with special reference to evolutionary rate-pattern change after gene duplication, and implement an efficient maximum likelihood algorithm for sequence analysis; (2) develop a hidden Markov model (HMM) framework for predicting amino acid residues that have experienced rate-pattern changes among homologous genes of a gene family; (3) extend these methods so that they can handle complicated cases such as very large gene families; and (4) test the relationship between sequence diversity and functional/structural diversity (the auxiliary principle) using protein families for which relevant biological information is available, e.g., whether residues shown by site-directed mutagenesis to be important for functional divergence consistently receive high HMM scores. Our methods may have many applications, from understanding the 3D structural basis of functional divergence and predicting critical residues to exploring the pattern of functional divergence among homologous genes at the genome level.
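
As a rough illustration of the kind of calculation aim (2) refers to, the sketch below computes, for each alignment site, the posterior probability of a hidden "rate-pattern changed" state in a generic two-state HMM using the scaled forward-backward algorithm. It is not the model proposed here: the function name `site_posteriors`, the symmetric `stay` transition probability, and the per-site emission likelihoods are all hypothetical, and in practice those likelihoods would have to come from a phylogenetic substitution model that is not shown.

```python
from typing import List, Sequence, Tuple


def site_posteriors(
    emissions: Sequence[Tuple[float, float]],
    initial: Tuple[float, float] = (0.5, 0.5),
    stay: float = 0.9,
) -> List[float]:
    """Posterior probability of state 1 ("changed") at each site.

    emissions[i] = (likelihood of site i under state 0, under state 1).
    These values are assumed inputs; at least one must be positive per site.
    `stay` is an assumed symmetric probability of keeping the same state
    between neighbouring sites.
    """
    n = len(emissions)
    if n == 0:
        return []
    # trans[a][b] = P(next state = b | current state = a)
    trans = ((stay, 1.0 - stay), (1.0 - stay, stay))

    # Forward pass, normalised at every site to avoid numerical underflow.
    forward: List[Tuple[float, float]] = []
    prev = initial
    for i, (e0, e1) in enumerate(emissions):
        if i == 0:
            prior = initial
        else:
            prior = (
                prev[0] * trans[0][0] + prev[1] * trans[1][0],
                prev[0] * trans[0][1] + prev[1] * trans[1][1],
            )
        a0, a1 = prior[0] * e0, prior[1] * e1
        total = a0 + a1
        prev = (a0 / total, a1 / total)
        forward.append(prev)

    # Backward pass, also normalised at every site.
    backward: List[Tuple[float, float]] = [(1.0, 1.0)] * n
    for i in range(n - 2, -1, -1):
        e0, e1 = emissions[i + 1]
        n0, n1 = backward[i + 1]
        b0 = trans[0][0] * e0 * n0 + trans[0][1] * e1 * n1
        b1 = trans[1][0] * e0 * n0 + trans[1][1] * e1 * n1
        total = b0 + b1
        backward[i] = (b0 / total, b1 / total)

    # Combine both passes; the scaling constants cancel on renormalisation.
    posteriors = []
    for (f0, f1), (b0, b1) in zip(forward, backward):
        p0, p1 = f0 * b0, f1 * b1
        posteriors.append(p1 / (p0 + p1))
    return posteriors
```

Sites with a high posterior would be candidates for comparison against experimentally verified residues, as described in aim (4). With `stay=0.5` neighbouring sites become independent, so each posterior depends only on that site's own likelihoods and the initial probabilities.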

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM062118-04
Application #
6642068
Study Section
Genome Study Section (GNM)
Program Officer
Eckstrand, Irene A
Project Start
2000-09-01
Project End
2005-08-31
Budget Start
2003-09-01
Budget End
2005-08-31
Support Year
4
Fiscal Year
2003
Total Cost
$144,500
Indirect Cost
Name
Iowa State University
Department
Genetics
Type
Schools of Arts and Sciences
DUNS #
005309844
City
Ames
State
IA
Country
United States
Zip Code
50011
Gao, Xiang; Vander Velden, Kent A; Voytas, Daniel F et al. (2005) SplitTester: software to identify domains responsible for functional divergence in protein family. BMC Bioinformatics 6:137
Gu, Xun; Huang, Wei; Xu, Dongping et al. (2005) GeneContent: software for whole-genome phylogenetic analysis. Bioinformatics 21:1713-4
Gu, Xun; Zhang, Hongmei (2004) Genome phylogenetic analysis based on extended gene contents. Mol Biol Evol 21:1401-8
Wu, Shiquan; Gu, Xun (2003) Algorithms for multiple genome rearrangement by signed reversals. Pac Symp Biocomput :363-74
Gu, Jianying; Gu, Xun (2003) Natural history and functional divergence of protein tyrosine kinases. Gene 317:49-57
Gu, Xun (2003) Functional divergence in protein (family) sequence evolution. Genetica 118:133-41
Gu, Xun; Vander Velden, Kent (2002) DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 18:500-1
Gu, Jianying; Wang, Yufeng; Gu, Xun (2002) Evolutionary analysis for functional divergence of Jak protein kinase domains and tissue-specific genes. J Mol Evol 54:725-33
Gu, Xun; Huang, Wei (2002) Testing the parsimony test of genome duplications: a counterexample. Genome Res 12:1-2
Wu, Shiquan; Gu, Xun (2002) Multiple genome rearrangement by reversals. Pac Symp Biocomput :259-70

Showing the most recent 10 out of 15 publications