Globus Genomics was developed at the Computation Institute, University of Chicago, as an advanced genomics analysis platform that runs as Software-as-a-Service on Amazon Web Services and is powered by Globus and Galaxy. It was developed to meet the needs of both researchers and core lab providers who require a high-quality service with state-of-the-art capabilities to streamline data movement, simplify the creation of genomics analysis pipelines, automate the execution of those pipelines, and run analyses at very large scale on elastic compute infrastructure. Globus Genomics, under development for three years, has been used extensively by researchers at leading institutions, including the University of Washington, the University of Chicago, Washington University in St. Louis, Georgetown University, and Johns Hopkins. There is significant potential and demand to expand use of the service to meet the rapidly growing genomics analysis needs of both existing users and large communities of new users. We now propose work that will amplify the utility and impact of Globus Genomics by providing (1) scalability to thousands of simultaneous analyses by thousands of users, (2) support for state-of-the-art high-performance workflows and tools, including large-scale imputation analysis and consensus calling on structural variants, (3) automated cost and performance optimization to slash cloud computing costs and turnaround times, and (4) powerful dashboards for end-to-end and summary views of large-scale analyses.
These enhancements will be enabled by development in the following key areas: enhancing and extending the Globus Genomics computational framework to enable high-performance, reliable execution of standard and novel next-generation sequencing (NGS) analysis workflows on large and extremely large datasets; creating and maintaining state-of-the-art pipelines for variant calling, whole genome analysis, RNA-seq, and ChIP-seq, which involves computationally profiling the latest versions of tools and understanding different computational modalities for optimal execution on Amazon Web Services; creating a profiling and optimization framework to enable automated, cost- and/or time-optimal configuration of NGS applications and workflows on large cloud systems; and creating an automatic computational provisioning framework. The grant award would allow us to address key needs of current and prospective users and thus to provide an important bioinformatics platform to researchers who otherwise could not easily access such capabilities.

Public Health Relevance

More than 300 researchers across 25 universities and research organizations, in such fields as neurodevelopmental disorders, cancer, diabetes, and cardiovascular disease, have leveraged Globus Genomics to analyze multiple terabytes of sequence data. We propose here to expand the performance and utility of this software system by creating optimized, scalable analysis pipelines for exome, whole genome, and RNA-seq datasets; scaling up the analysis by adopting best-of-breed computational technologies; and validating our efforts by working closely with our existing users.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG009018-03
Application #: 9533667
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Di Francesco, Valentina
Project Start: 2016-09-28
Project End: 2020-07-31
Budget Start: 2018-08-01
Budget End: 2019-07-31
Support Year: 3
Fiscal Year: 2018
Total Cost:
Indirect Cost:
Name: University of Chicago
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 005421136
City: Chicago
State: IL
Country: United States
Zip Code: 60637
Jagodnik, Kathleen M; Koplev, Simon; Jenkins, Sherry L et al. (2017) Developing a framework for digital objects in the Big Data to Knowledge (BD2K) commons: Report from the Commons Framework Pilots workshop. J Biomed Inform 71:49-57
Al-Khersan, Hasenin; Shah, Kaanan P; Jung, Segun C et al. (2017) A novel MERTK mutation causing retinitis pigmentosa. Graefes Arch Clin Exp Ophthalmol 255:1613-1619