HyPhy (www.hyphy.org) is a scriptable software platform designed to enable flexible and powerful analyses of DNA, RNA, codon, amino acid and other types of sequence data in an evolutionary context. Since their initial formal release in 2005, HyPhy and its free web server Datamonkey (www.datamonkey.org) have become stable and mature, have been cited in over 2,000 peer-reviewed publications, described in multiple book chapters, and featured in many invited workshops. In the first four-year funding cycle we focused on improving and hardening the back-end elements of the HyPhy system. In this proposal, while we will continue to work at that level, we focus more on the front end, where interactions with users are key.
Aim 1: Re-architect the HyPhy analytical engine and its scripting language.
1.1 Redesign the HyPhy Batch Language (HBL) to enhance its productivity, reliability, maintainability, portability and reusability; this will be done while maintaining backward compatibility.
1.2 Redevelop the standard library of evolutionary models, standard analyses, and inference procedures to make them self-documenting, easy to learn, easy to extend, robust to inadvertent misapplication, and compliant with data exchange formats and communication protocols.
Aim 2: Models and algorithms for large and complex datasets.
2.1 Improve HyPhy performance to handle much larger datasets in a single analysis by accelerating the fundamental operations in hardware and software.
2.2 Allow users to combine sequence and other quantitative data in a likelihood framework.
2.3 Develop a library of analyses for rapidly evolving pathogens and immune repertoires.
Aim 3: Web browser-based graphical interface for all computing devices. Presently, the majority of HyPhy users interact with the program through www.datamonkey.org. Keeping in mind the demand for such a user experience, we will:
3.1 Implement a complete interface for data exploration, analysis definition, job execution, and result interpretation and visualization as a local web application. This interface will run on computers, tablets, and smartphones.
3.2 Develop the computational core of HyPhy as a native browser plug-in, making HyPhy an app distributed, maintained, and run entirely in a browser.
3.3 Re-implement datamonkey.org using modern web technologies (node.js); provide a public instance which can be accessed from any instance of HyPhy, and a distribution that allows labs to set up their own cluster- or cloud-based instances.

Public Health Relevance

Molecular evolutionary analyses are central to many aspects of basic, translational, and applied biomedical research. Examples include identifying mutations that allow pathogens to evade the immune response; predicting the structure and function of proteins; estimating the evolutionary relatedness of human or other populations; and characterizing the magnitude of selective pressure, either natural or artificial, on genes or species. The HyPhy software platform provides both a programming platform that speeds up methodological development and a very large collection of computational tools for end users.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Temple University
Schools of Arts and Sciences
United States
Spielman, Stephanie J; Kosakovsky Pond, Sergei L (2018) Relative evolutionary rates in proteins are largely insensitive to the substitution model. Mol Biol Evol :
Pacheco, M Andreína; Matta, Nubia E; Valkiunas, Gediminas et al. (2018) Mode and Rate of Evolution of Haemosporidian Mitochondrial Genomes: Timing the Radiation of Avian Parasites. Mol Biol Evol 35:383-403
Shank, Stephen D; Weaver, Steven; Kosakovsky Pond, Sergei L (2018) phylotree.js - a JavaScript library for application development and interactive data visualization in phylogenetics. BMC Bioinformatics 19:276
Rife Magalis, Brittany; Kosakovsky Pond, Sergei L; Summers, Michael F et al. (2018) Evaluation of global HIV/SIV envelope gp120 RNA structure and evolution within and among infected hosts. Virus Evol 4:vey018
Weaver, Steven; Shank, Stephen D; Spielman, Stephanie J et al. (2018) Datamonkey 2.0: a modern web application for characterizing selective and other evolutionary processes. Mol Biol Evol :
Ragonnet-Cronin, Manon; Jackson, Celia; Bradley-Stewart, Amanda et al. (2018) Recent and Rapid Transmission of HIV Among People Who Inject Drugs in Scotland Revealed Through Phylogenetic Analysis. J Infect Dis 217:1875-1882
Spielman, Stephanie J; Kosakovsky Pond, Sergei L (2018) Relative evolutionary rate inference in HyPhy with LEISR. PeerJ 6:e4339
Kosakovsky Pond, Sergei L; Weaver, Steven; Leigh Brown, Andrew J et al. (2018) HIV-TRACE (TRAnsmission Cluster Engine): a Tool for Large Scale Molecular Epidemiology of HIV-1 and Other Rapidly Evolving Pathogens. Mol Biol Evol 35:1812-1819
Frost, Simon D W; Magalis, Brittany Rife; Kosakovsky Pond, Sergei L (2018) Neutral Theory and Rapidly Evolving Viral Pathogens. Mol Biol Evol 35:1348-1354
Forrester, Naomi L; Wertheim, Joel O; Dugan, Vivian G et al. (2017) Evolution and spread of Venezuelan equine encephalitis complex alphavirus in the Americas. PLoS Negl Trop Dis 11:e0005693

Showing the most recent 10 out of 67 publications