Biomedical research has been rapidly transformed into an informatics-intensive discipline. This has created challenges at many levels, including the availability of computational infrastructure and expertise, the burden of keeping up with rapidly developing tools and best practices, communication difficulties between experimentalists and computational researchers, and difficulties ensuring reproducibility. Over the last six years we have developed an open-source software framework, Galaxy, to address these issues. Galaxy provides an accessible analysis environment allowing experimentalists to use cutting-edge tools on large datasets, with automated tracking to ensure reproducibility. Galaxy makes it easy for tool developers to quickly put their tools into experimentalists' hands. Galaxy has become an indispensable resource for the genomic research community. First, thousands of experimentalists use Galaxy's tools in their research (as evidenced in many publications). Beyond that, Galaxy has been adopted as the local analysis infrastructure for many dozens of labs and institutes. Galaxy is flexible enough to be deployed on a variety of different compute resources, which is particularly important as data production is increasingly decentralized. At Galaxy's core is a powerful, extensible framework that other important community resource projects are now integrating or building on. Thus Galaxy is ideally positioned to become a substrate for sharing and communicating analysis. We propose to expand the Galaxy resource with novel approaches for accessible, transparent, and reproducible analysis in a decentralized world. Driven by biological projects, we will build best-practice workflows for several sequencing-based experiments. We will create innovative methods to automate packaging and deploying analysis tools. We will build and maintain the Galaxy Tool Shed, a hub for sharing tools, best-practice workflows, and analysis strategies. We will develop a novel approach for publishing analysis. We will create a framework for visual analytics leveraging existing Galaxy tools. Finally, we will build a complete solution for managing sequencing workflows, including sample tracking and instrument integration.

Public Health Relevance

The rapid proliferation of genomic approaches is revolutionizing the medical field by creating novel diagnostic applications. This project will make cutting-edge biomedical analysis tools available to every clinical researcher, fulfilling the translational promise of sequencing technologies.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Biotechnology Resource Cooperative Agreements (U41)
Study Section
Special Emphasis Panel (ZHG1-HGR-M (O2))
Program Officer
Bonazzi, Vivien
Pennsylvania State University
Schools of Arts and Sciences
University Park
United States
Grüning, Björn A; Rasche, Eric; Rebolledo-Jaramillo, Boris et al. (2017) Jupyter and Galaxy: Easing entry barriers into complex data analyses for biomedical researchers. PLoS Comput Biol 13:e1005425
Børnich, Claus; Grytten, Ivar; Hovig, Eivind et al. (2016) Galaxy Portal: interacting with the galaxy platform through mobile devices. Bioinformatics 32:1743-5
Stoler, Nicholas; Arbeithuber, Barbara; Guiblet, Wilfried et al. (2016) Streamlined analysis of duplex sequencing data with Du Novo. Genome Biol 17:180
Afgan, Enis; Baker, Dannon; van den Beek, Marius et al. (2016) The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Res 44:W3-W10
Goecks, Jeremy; El-Rayes, Bassel F; Maithel, Shishir K et al. (2015) Open pipelines for integrated tumor genome profiles reveal differences between pancreatic cancer tumors and cell lines. Cancer Med 4:392-403
Harris, Nomi L; Cock, Peter J A; Chapman, Brad A et al. (2015) The Bioinformatics Open Source Conference (BOSC) 2013. Bioinformatics 31:299-300
Blankenberg, Daniel; Taylor, James; Nekrutenko, Anton (2015) Online resources for genomic analysis using high-throughput sequencing. Cold Spring Harb Protoc 2015:324-35
Budd, Aidan; Corpas, Manuel; Brazas, Michelle D et al. (2015) A quick guide for building a successful bioinformatics community. PLoS Comput Biol 11:e1003972
Leo, Simone; Pireddu, Luca; Cuccuru, Gianmauro et al. (2014) BioBlend.objects: metacomputing with Galaxy. Bioinformatics 30:2816-7
Blankenberg, Daniel; Von Kuster, Gregory; Bouvier, Emil et al. (2014) Dissemination of scientific software with Galaxy ToolShed. Genome Biol 15:403

Showing the most recent 10 out of 24 publications