This award supports a research program to understand and address the challenges of scaling specific gravitational wave physics applications from a prototype cluster of 16 Sony PlayStation 3 (PS3) gaming consoles to the largest such system now available, built by the Air Force Research Laboratory (AFRL) in Rome, NY. This system, named AFRL CONDOR, uses 1,716 PS3s alongside traditional servers and Nvidia CUDA GPUs to achieve 500 TFLOPS of computing power. The PI has unrestricted access to this large system through a recently established Cooperative Research and Development Agreement (CRADA) with AFRL. One of the applications targeted in this work models the capture of a small (say, solar-mass) black hole by a supermassive black hole, an important problem in theoretical gravitational wave physics and a potential source of gravitational waves for space-based detectors.

There is considerable current interest in harnessing the power of video gaming technology for scientific high-performance computing. The PI's prototype cluster was used successfully for scientific computation and demonstrated order-of-magnitude gains in metrics such as performance-per-dollar and performance-per-Watt compared with traditional CPU-based clusters. Successfully scaling up the applications used in this project will immediately impact the gravitational wave science that these applications enable. In addition, the lessons learned and the experience gained in achieving good scaling on a large system like AFRL CONDOR will be extremely valuable and likely applicable to other problems and systems. Parallelism and optimization approaches developed through this project may also prove applicable in other problems and areas. The outcomes and results will be published in research journals and at conferences, and will also be made openly available through the PI's research website. This project would also be very attractive to both physics and engineering students. The supported graduate student will learn about various aspects of supercomputing, including how to approach and address challenges related to scaling on large supercomputers like AFRL CONDOR.

Project Report

This research project evaluated the potential of low-cost video-gaming technologies to perform scientific calculations at a very large scale. More specifically, the research focused on understanding and addressing the challenges of scaling a number of black hole astrophysics calculations from a small cluster (16 Sony PS3s) to the Air Force's CONDOR system, which is roughly one hundred times larger. The lessons learned and the experience gained in achieving good performance on a large system like AFRL CONDOR have been extremely valuable and are likely applicable to other problems and systems as well. The main outcome of these research efforts has been very positive. Broadly speaking, we have demonstrated that it is possible to achieve excellent scaling on a petascale-class system like AFRL CONDOR; however, doing so requires a careful and deep rethinking of the approach to parallelism and performance scaling. In addition to the experience gained in achieving high performance on a novel petascale-class system, the project advanced a number of scientific projects in the area of black hole physics. More details are available in the eight (8) technical publications that resulted from this work. In the long term, this broad outcome is likely to have a strong positive impact on many areas of computational science. Several projects may migrate from traditional clusters to clusters employing gaming technologies owing to their extremely high cost effectiveness. This would open up the possibility of significant savings in computational science budgets and of performing larger simulations than were possible before. The novel parallelism and optimization approaches developed as a result of this work may also prove applicable in other problems and areas.
Eventually, the effort and experience are likely to benefit many other areas of computational science and engineering. It is also worth noting that this project turned out to be very attractive to both physics and engineering students. The budget included support for graduate students, so the project work involved student training. The supported students learned about various important aspects of supercomputing, including how to approach and address challenges related to scaling on large supercomputers like AFRL CONDOR. Lastly, because our research activities contributed to the development of broad mathematical and computational tools, and also to the education and training of students, this project's impact will likely extend beyond its research area and thus potentially benefit other scientific and engineering disciplines. The mathematical and computational skills developed by the supported students will qualify them for a wide variety of technical positions, including those in areas of great current national need.

Agency: National Science Foundation (NSF)
Institute: Division of Physics (PHY)
Type: Standard Grant (Standard)
Application #: 1135664
Program Officer: Bogdan Mihaila
Project Start:
Project End:
Budget Start: 2011-09-01
Budget End: 2013-08-31
Support Year:
Fiscal Year: 2011
Total Cost: $48,798
Indirect Cost:
Name: University of Massachusetts, Dartmouth
Department:
Type:
DUNS #:
City: North Dartmouth
State: MA
Country: United States
Zip Code: 02747