Phylogenetic analysis is crucial to a wide range of biological and medical research. A new type of data based on gene order and gene content within whole genomes has attracted increasing interest from researchers in the past several years.

Specific Aims: The complexity of genome evolution poses many exciting challenges to developers of mathematical models and algorithms. However, current tools can only be applied to small genomes (such as organelle genomes) evolving via very simple rearrangement events, so their range of application is limited. We will address these problems by: (1) mathematical modeling and theoretical analysis of complex evolutionary events such as gene duplication and loss; (2) algorithm design and implementation for phylogenetics and gene order reconstruction; (3) performance assessment of these new algorithms through extensive testing on simulated and biological datasets; (4) high-performance implementation of the algorithms using algorithm engineering techniques and a flexible approach to parallelization.

Contributions and Broader Impact: The proposed project has several broader impacts. (1) The development of new theories and algorithms for the efficient reconstruction of phylogenies and inference of ancestral genomes based on complex genome rearrangements will considerably enlarge the scope of research in the field and give rise to interesting new problems in mathematical and computational biology. (2) Efficient and accurate software for phylogenetic analysis and genome comparison, tested on a large variety of real datasets and on an extensive range of simulations, is expected to reveal new evolutionary patterns and to enable the investigation of novel biological questions.
(3) A web server hosted by our group (or by our collaborators) will enable biologists to submit their datasets through a user-friendly web interface and get results back within a reasonable amount of time, without the burden of installing software or learning parallel computation. (4) The project team combines expertise in mathematical modeling, algorithm design, high-performance computing, comparative genomics, and phylogenetics. Students (both undergraduate and graduate) and postdocs on this project will receive valuable interdisciplinary training. (5) Both universities have established programs to boost research in computational biology. This project will enable the PIs to establish close interdisciplinary collaborations among departments at both universities and to recruit graduate students (especially minority students) to this fast-growing research field.