This project will develop a rich and flexible mathematical framework for quantifying the phylogenetic signal (i.e., the information about genealogical relationships among organisms) contained in large data sets of genome sequences. To quantify this signal, the relationships among different phylogenetic trees, or among the branches in those trees, will be conceptualized as networks. The structure of these networks provides biologically meaningful information about the degree and causes of phylogenetic uncertainty and conflict. The newly developed framework will be implemented in publicly available software usable both as a standalone application and as a web service. In addition, the core functionality of this software will be closely integrated with other ongoing NSF-funded efforts to reconstruct and explore the Tree of Life (iPlant and the Open Tree of Life).
This work will draw on expertise in evolutionary biology, computer science, and applied mathematics. Research activities will involve the direct contributions of graduate and undergraduate researchers and software developers. Undergraduates will learn to code from graduate students in mentor-apprentice relationships. An undergraduate science communications student will develop a website illustrating phylogenetic principles and applied phylogenetic research, highlighting novel insights provided by the methods developed. Additionally, several in-person and remote training workshops will be held to introduce researchers to the newly developed tools. An ongoing computational biology seminar series aimed at undergraduates will be held at Louisiana State University. The methods developed as part of this project will improve our ability to understand phylogenetic information, which is integral to a variety of applications in conservation, forensics, agriculture, and epidemiology. Updates on project progress and links to developed tools will be available at www.phyleauxgenetics.org/.