This supplemental proposal knits together several bioinformatics visualization tools in the service of SARS-CoV-2 genome analysis. The core of the proposal is a newly prototyped JavaScript viewer, ABrowse, that renders multiple sequence alignments, navigable by phylogenetic trees and integrated with protein structure views, all in a single embeddable component. ABrowse is currently used to render the Pfam SARS-CoV-2 special release: a collection of 40 protein domains from the coronavirus genome, along with PDB structures. (ABrowse is also a candidate for Pfam's future default viewer, as noted in the letters of support.) We propose to accelerate ABrowse development for use during the COVID-19 pandemic, specifically targeting the scaling, performance, and integration issues that are most relevant to scientists studying the virus. Chief among these is scaling ABrowse to handle millions of protein sequences (and/or SARS-CoV-2 genome sequences) by means of a new, compressed storage format suitable for random-access, user-driven exploration of very large trees (and alignments) over the web. Beyond scaling, we also address integration by developing plugins that allow ABrowse to run within JBrowse (the genome browser that is the focus of the project to which this is a supplemental proposal) as well as Auspice (the web dashboard of NextStrain, the phylogenetic genome alignment and annotation package that is widely used for COVID-19 analysis). We also propose several user interface enhancements to make ABrowse more useful as a navigation tool for COVID-19 data.
We will develop a new web application for integrative browsing of SARS-CoV-2 genome sequences, protein alignments, structures, and phylogenetic trees. The web app will scale to millions of genome-length sequences and will be integrated with the JBrowse genome browser and the NextStrain pathogen visualization platform. The app is built exclusively with dynamic HTML and JavaScript, so it can be served from cloud storage with negligible server-side CPU load.