Comparative analysis of DNA and amino acid sequences is now routinely used to trace the origins, patterns, and evolutionary relationships of homologous sequences. Owing to recent advances in DNA sequencing technologies, sequence datasets now contain increasingly large numbers of sequences that make up families of orthologous (arisen by speciation) and/or paralogous (arisen by gene duplication) sequences from diverse species. The need for biologist-centric tools for evolutionary and functional genomic analysis of these data is therefore growing. We propose to address this need by expanding the scope of the Molecular Evolutionary Genetics Analysis (MEGA) software to the analysis of gene families. This will involve developing new software to streamline the acquisition of large gene family datasets, making MEGA cross-platform, and implementing efficient heuristics for estimating very large trees quickly and for inferring gene duplication events and divergence times. Sequence lengths in gene family alignments are biologically constrained, unlike in species history analysis, for which full genomes and multiple genes are often available to improve precision. We therefore plan to use computer simulations with biologically realistic parameters to evaluate the accuracy of the phylogenetic trees produced by these extremely fast, but highly heuristic, inference algorithms. Insights gained from these efforts will be incorporated into algorithm designs in MEGA. Overall, the software and research developments will contribute to advances in molecular evolution, bioinformatics, functional genomics, computational biology, and basic biomedicine. As always, MEGA and its source code will be made available free of charge for all uses, including research, education, and training.

Public Health Relevance

Evolutionary bioinformatics is a powerful tool for conducting in silico functional analysis of DNA and protein sequences from genes and genomes of diverse organisms. The proposed software development and fundamental research will lead to an advanced Molecular Evolutionary Genetics Analysis (MEGA) tool for use by biologists in their quest to better understand the evolutionary dynamics of gene families residing in the genomes of humans as well as their evolutionary relatives and pathogens.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG002096-12
Application #: 8490404
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
Project Start: 2000-01-01
Project End: 2014-06-30
Budget Start: 2013-07-01
Budget End: 2014-06-30
Support Year: 12
Fiscal Year: 2013
Total Cost: $253,075
Indirect Cost: $87,124

Name: Arizona State University-Tempe Campus
Department: Genetics
Type: Organized Research Units
DUNS #: 943360412
City: Tempe
State: AZ
Country: United States
Zip Code: 85287

Karim, Sajjad; NourEldin, Hend Fakhri; Abusamra, Heba et al. (2016) e-GRASP: an integrated evolutionary and GRASP resource for exploring disease associations. BMC Genomics 17:770
Liu, Li; Tamura, Koichiro; Sanderford, Maxwell et al. (2016) A Molecular Evolutionary Reference for the Human Variome. Mol Biol Evol 33:245-54
Battistuzzi, Fabia U; Billing-Ross, Paul; Murillo, Oscar et al. (2015) A Protocol for Diagnosing the Effect of Calibration Priors on Posterior Time Estimates: A Case Study for the Cambrian Explosion of Animal Phyla. Mol Biol Evol 32:1907-12
Hedges, S Blair; Marin, Julie; Suleski, Michael et al. (2015) Tree of life reveals clock-like speciation and diversification. Mol Biol Evol 32:835-45
Butler, Brandon M; Gerek, Z Nevin; Kumar, Sudhir et al. (2015) Conformational dynamics of nonsynonymous variants at protein interfaces reveals disease association. Proteins 83:428-35
Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir et al. (2015) Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine. Curr Opin Struct Biol 35:135-42
Miura, Sayaka; Tate, Stephanie; Kumar, Sudhir (2015) Using Disease-Associated Coding Sequence Variation to Investigate Functional Compensation by Human Paralogous Proteins. Evol Bioinform Online 11:245-51
Filipski, Alan; Tamura, Koichiro; Billing-Ross, Paul et al. (2015) Phylogenetic placement of metagenomic reads using the minimum evolution principle. BMC Genomics 16 Suppl 1:S13
Gerek, Nevin Z; Liu, Li; Gerold, Kristyn et al. (2015) Evolutionary Diagnosis of non-synonymous variants involved in differential drug response. BMC Med Genomics 8 Suppl 1:S6
Kumar, Sudhir; Ye, Jieping; Liu, Li (2014) Reply to: "Proper reporting of predictor performance". Nat Methods 11:781-2

Showing the most recent 10 out of 47 publications