This grant supports an upgrade of the computer facilities now maintained by the Alliance for Computational Earth Science (ACES) at MIT. ACES manages cluster computing resources for three academic departments spanning two schools at MIT. Approximately one third of the processors are in the geophysics subcluster, ACES-GEO. The ACES-GEO PIs will purchase a pre-assembled cluster machine from THINKMATE that will include a controlling node, at least 11 compute nodes, and a switch unit. Each node will be configured with multiple quad-core Intel CPUs and large RAM and storage capacities. The new cluster will support computationally intensive PI and student research involving geodynamical modeling of Earth and planetary interiors, geodetic data analysis, cryosphere dynamical modeling, seismic imaging and interpretation, landform evolution, and inversion of remanent magnetization. The cluster will support the computational needs of four assistant professors, two of whom are women.
A broad group of MIT researchers has formed a coalition, called the Alliance for Computational Earth Science (ACES) at MIT, that focuses on developing and deploying advanced computational technologies to address challenging problems in Earth science. The EAR-funded subset of these researchers received $75,000 under this grant to upgrade the then-existing ACES cluster. Funds from this grant were leveraged with approximately $225,000 from several other investigators and from MIT institutional funds. These funds were used to purchase and assemble an InfiniBand-interconnected cluster with 456 Intel Westmere cores, a total of 2 TB of memory, and 200 TB of shared storage. Serious constraints on cluster computing at MIT and elsewhere include the need for scarce space with substantial cooling capacity and power. To address these constraints, MIT has developed and operates a shared, purpose-built, off-campus computing facility in a building originally constructed to house a particle accelerator. A 10 Gigabit Ethernet network connects the facility (including our system) to the main campus. The system supports research computing for about 50 users, with a queue system managing access to resources. To date, almost 50,000 jobs have been run on the system, with typical job sizes ranging from 6 to 200 cores. This readily accessible high-performance computing facility is beginning to enable a broad portfolio of research, with funding from NSF, other governmental agencies, and industry. Because the cluster has been running at full capacity for less than a year, few publications have resulted from it so far; important findings are expected shortly. Intellectual Merit: The upgraded ACES cluster at MIT allows advanced computing technologies to be applied to a wide variety of Earth science problems.
The areas of research being carried out on the cluster include geodynamical modeling of both the Earth and other bodies in various evolutionary stages, analysis of geodetic data, ice sheet models (and other dynamic systems) driven by large data sets, seismic imaging and interpretation, landform evolution, and inversion of remanent magnetization. The participants interact and exchange ideas across a broad set of topics, including new developments in computational methods, modeling, and data analysis that are being explored and implemented. Broader Impacts: The upgraded cluster is being used to educate undergraduate and graduate students in the rapidly developing field of computational science, a matter of national interest. The co-investigators on this proposal have an excellent track record of educating both undergraduate and graduate students. Most of our undergraduate majors and about half of our graduate students are women. The ACES upgrade is providing these young investigators with the hardware and software necessary to produce more complete and comprehensive models and simulations in vastly shorter time frames. Additionally, by drawing on the experience and knowledge of existing ACES investigators in high-performance computing and parallel programming, these new investigators enjoy additional productivity gains. The software to be developed on the ACES cluster will be open source or freely available to academic researchers. The core of the queuing, scheduling, and checkpointing software is also open source. The developments made on ACES will be readily accessible to other investigators, including those using somewhat smaller clusters. Societally important problems are being addressed using the ACES system. These include estimating the state of the sea surface and how it responds to climate change.
Understanding volcanic and magmatic systems can have a direct societal impact in volcanically active regions, as can a more detailed understanding of plate movements and seismicity (including induced seismicity). The elements needed to understand these classes of problems are included in the ACES science section. All of these science elements are already developed to the point that, given the availability of the upgraded ACES computer system, they will produce the next generation of detailed and significant results.