Whole-genome and multi-species-genome segment sequencing projects are, along with re-sequencing efforts, producing increasingly large datasets for understanding the evolutionary dynamics of mutations, genes, and genomes. The need for effective analysis of these datasets, which contain large numbers of genes and species, has broadened the practice of evolutionary bioinformatics from specialist bioinformaticians to basic and applied biomedical researchers at the forefront of laboratory sciences. Therefore, we propose an integrated research and programming project that aims to provide extensible software with facilities for (a) high-throughput application of the same data analysis to different genes, domains, genomic segments, and groups of sequences using the new IterationExpert; (b) employing sophisticated computational tools in the familiar MEGA platform by linking applications using the new AppLinker; (c) visualizing differences in natural selection among positions in a protein-structural context using the StructTracer; and (d) conducting extensive analyses in order to [i] infer the evolutionary history of sequences from populations, species, and gene families; [ii] estimate confidence intervals for the times of species divergence and gene duplication events; [iii] deduce tracks of adaptive evolution in proteins, genes, and codons; [iv] test alternative evolutionary hypotheses; and [v] find the most appropriate model of molecular evolution in genes and lineages. Building on the successes of our previous software, we plan to add these new facilities for exploration and analysis of DNA and protein sequences in MEGA.
In addition, we plan to tackle the methodological challenges posed by the need to infer phylogenetic trees for large numbers of sequences and many genes. Using theoretical and empirical data analysis, we will focus on investigating the accuracy of different ways of combining data from multiple genes, assessing the performance of computationally feasible methods under different optimality criteria for increasing numbers of sequences, and developing novel methods and algorithms. Outcomes from these investigations will guide the incorporation of the next set of methods and algorithms for phylogenetic inference in MEGA. These software and research developments will contribute to advances in molecular evolution, bioinformatics, functional genomics, computational biology, and basic biomedicine. As always, MEGA will be made available free of charge for all uses, including research, education, and training.