As scientific and engineering research and the cyberinfrastructure (CI) that supports it evolve and mature, new opportunities are emerging to share and co-develop technology and so provide greater support and services to the computationally enabled research community. TeraGrid (TG) and Open Science Grid (OSG) have been working over the past year to identify such opportunities. This project, ExTENCI (Extending science Through Enhanced National CyberInfrastructure), will in part advance the support and capabilities provided to a set of key research activities that are pathfinders in their respective communities in harnessing CI resources and services.

ExTENCI will work with several representative research applications for which a limited but sustained CI effort will significantly increase the science they deliver. These applications range from large collaborative science to small research groups and include earthquake engineering, biology, protein structure, physics, and environmental studies. The team will work with specialists from these research areas to exploit enhancements made in four technology areas: (1) workflow and client tools that permit an application to use either TeraGrid or OSG resources, to use both simultaneously, or to use new resources (e.g., clouds); (2) distributed file systems operating across wide area networks that simplify access to and delivery of data; (3) virtual machine technologies that can hide the complexity of application environments and allow them to run in developing environments such as clouds; and (4) new job submission paradigms that use distributed grid resources more efficiently.

One important goal of ExTENCI is to enable the Southern California Earthquake Center (SCEC) to advance its hazard prediction curves by providing extended workflow capabilities and a new distributed storage solution, which will support data transfer and allow integrated use of the OSG and TeraGrid resources appropriate for each part of the workflow. The extended workflow capabilities will also permit protein 3-D structures to be determined for significantly longer amino acid sequences. The distributed storage mechanism will also significantly improve access by U.S. universities to Large Hadron Collider data and will simplify the sharing of simulated data by institutions participating in the Lattice Quantum Chromodynamics (LQCD) project. ExTENCI also aims to provide improved virtual machine (VM) and job submission capabilities that extend the range of CI resources available to applications as diverse as experiments at the Relativistic Heavy Ion Collider (RHIC) and oil reservoir simulations that depend on the Ensemble Kalman Filter algorithm.

Project Report

Summary

The national cyberinfrastructure (CI) consists of a nationwide network of high-performance supercomputers, originally provided by TeraGrid (TG) and now provided by the Extreme Science and Engineering Discovery Environment (XSEDE), together with high-throughput computing resources (large numbers of powerful microcomputers) provided by the Open Science Grid (OSG). The CI computers are located at universities and national laboratories. There are also experimental and commercial "cloud" computers that run applications embedded in virtual machines that contain the entire software system (e.g., Microsoft Windows or Linux). The CI is funded by the NSF, the Department of Energy, and universities. Scientists, researchers, and engineers use the CI to solve scientific and engineering problems through simulation. They also study and analyze data from experiments (such as the Large Hadron Collider, LHC) and from satellites (such as weather, ocean, land, and climate data).

ExTENCI’s primary goal was to develop and provide production-quality enhancements to the CI, either to enable specific science applications to use both the OSG and TG/XSEDE more easily or to broaden access to a capability so that both TG/XSEDE and OSG users could use it. ExTENCI made enhancements in four areas: 1) workflow and client tools that permit a software application to use OSG, XSEDE, and cloud resources; 2) distributed file systems that provide easy and fast access to scientific data across a wide area network connecting universities and national laboratories; 3) virtual machine technologies that hide the complexity of applications and allow them to run on different cloud environments; and 4) job submission paradigms that distribute work to computing resources. Eight institutions provided the ExTENCI project staff.

Intellectual Merit

ExTENCI enhanced the capability and expanded the use of five tools across four areas of the CI:

1) Workflow and Client Tools: Swift, a parallel scripting language for running applications on the CI, was enhanced to (a) dispatch work to either XSEDE or OSG; (b) handle a larger number of simultaneous jobs; (c) take advantage of OSG’s glidein pilot job system; and (d) make use of OSG’s data services.

2) Distributed File Systems: Lustre/WAN, a filesystem that provides transparent access to data stored at another site, was enhanced by adding Kerberos-based user authentication. ExTENCI set up hardware and software to support Lustre/WAN services at five sites and installed Lustre servers at two sites. Performance analysis and tuning were completed to enable CMS (an LHC high energy physics experiment) science applications at two Florida universities (FIU and FSU) to run on data from the University of Florida and Fermilab. Three major science applications were tested in this environment.

3) Virtual Machines (VMs) and Cloud Technologies: HTCondor, a specialized workload management system for compute-intensive jobs and the basis of the OSG, was improved to better support cloud services. ExTENCI created new tools for creating, managing, and distributing VMs to clouds; invented a Cloud Dashboard that enables users to start, interact with, and stop VMs via a browser; and created two new cloud services.

4) Job Submission Paradigms: SAGA/BigJob, a standardized language for defining and running applications on the CI, was enhanced by designing and implementing an HTCondor adaptor that provides an interface to OSG, thereby enabling SAGA and BigJob users to send application jobs to either XSEDE or OSG resources. BigJob was made more reliable and was given a Gateway that enables scientists to set up pilot-based application runs via a browser interface.

Broad Impact

ExTENCI produced enhancements to the CI that were used by researchers in the science areas of glass material modeling, protein modeling, theoretical chemistry, earth systems science, molecular biophysics, high energy physics, and sociology of science. This science work, submitted using these enhanced tools, consumed about 7 million hours of computer time on the CI. Eight scientific papers were published, and more than thirty presentations and papers were delivered at conferences. A PhD student used these tools to do the science behind his successful thesis. A female PhD student and a female MS student are working on their theses. User guides were enhanced or written, and tutorials were delivered to train people on these technologies. Ten graduate students supplemented their training by working on ExTENCI. The new software and facilities provided by ExTENCI will remain to serve scientists in the future.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Application #: 1007115
Program Officer: Barry I. Schneider
Project Start:
Project End:
Budget Start: 2010-08-01
Budget End: 2013-07-31
Support Year:
Fiscal Year: 2010
Total Cost: $2,143,231
Indirect Cost:

Name: University of Florida
Department:
Type:
DUNS #:
City: Gainesville
State: FL
Country: United States
Zip Code: 32611