"This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5)."

Dependable and effective access to large amounts of sustained computing power is strategic to scientific discovery in a vast range of disciplines. An ever-expanding number of science communities worldwide rely on Condor software tools to harness the power of shared computing resources. Closer to home, over 100 US universities have installed these tools on their campuses, and downloads of the software surpass 2,000 per month. From fulfilling a strategic role in managing high-energy physics computing resources at Brookhaven and Fermi National Labs to enabling research computing at smaller under-represented institutions, Condor technology represents the culmination of 20 years of distributed computing research. The efforts outlined in this proposal allow the research contributions of this software to continue, and strengthen the momentum that enables academia and industry to leverage the power of effective distributed computing.

The intellectual merit of the proposed effort lies in the novelty of the distributed mechanisms implemented and the software engineering challenges the team faces in developing, maintaining and supporting nearly one million lines of code in an academic setting. As a leading provider of open-source distributed computing capabilities, the software plays an important role as a primary "building block" in many campus grids as well as community, national and international cyber-infrastructure initiatives. The size, complexity, diversity and quality assurance requirements of the software also lead to challenges in the area of software engineering and integration. Daily builds and testing, version control, software tracking and distributed troubleshooting of multi-layer software stacks are examples of the ongoing challenges.

The broader impacts of this effort are interdisciplinary and span both academia and industry. Researchers who use Condor tools are able to greatly increase their computing throughput, and consequently increase the size and complexity of the problems they study. The software is widely used in many compute-intensive disciplines, including Art, Biotechnology, Economics, Chemistry, Medicine, High Energy and Nuclear Physics, and Computer Science, and has been adopted by both universities and government labs. Furthermore, the software plays a critical role by managing the majority of the compute resources for the US Large Hadron Collider experiments, as well as serving 250 LIGO scientists studying gravitational waves. In the past year, Condor technology has fueled new commercial offerings such as Red Hat's MRG and Cycle Computing's CycleServer. These technologies are also used for mission-critical tasks in industry; e.g., Micron Technology uses Condor on 15,000 CPUs worldwide to validate memory chip manufacturing, J.P. Morgan uses Condor on 5,000 CPUs for real-time financial portfolio analysis, and Disney's fully animated film "The Wild" used Condor to render more than 70 million frames. Because the high-throughput computing capabilities we offer effectively harness even desktop machines running Windows, they enable students and scientists at underrepresented institutions to pursue research and instruction requiring considerable computing in the absence of access to traditional high-end computing facilities. This proposal directly funds five students, and through GLOW it will provide an excellent training environment for new generations of interdisciplinary professionals.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 0850745
Program Officer: Kevin L. Thompson
Project Start:
Project End:
Budget Start: 2009-06-01
Budget End: 2013-05-31
Support Year:
Fiscal Year: 2008
Total Cost: $2,700,000
Indirect Cost:

Name: University of Wisconsin Madison
Department:
Type:
DUNS #:
City: Madison
State: WI
Country: United States
Zip Code: 53715