Collaborative Research: Applying Hardware-Inspired Methods for Multi-Core Software Design
Brian C. Demsky, University of California, Irvine

0725357 Collaborative Research: Applying Hardware-Inspired Methods for Multi-Core Software Design
Michael B. Taylor, University of California, San Diego

In the past, improvements in microprocessor capabilities were expressed largely through a combination of clock frequency increases and microarchitectural enhancements that were invisible to the typical developer. More recently, due to power and microarchitectural scalability issues, microprocessor designs have diverged from this path and have begun to focus on exposing improved semiconductor process capabilities through the multi-core abstraction, which integrates multiple independent processors into a single chip. The deployment of such explicitly-parallel multi-core processors has deep implications on the future of software systems. While parallel software has been largely unnecessary in desktop systems, it will become essential if we are to expect continued increases in software functionality and programmer productivity like those that society has enjoyed over the last 35 years.

This research investigates a new design methodology for developing the parallel software systems that are necessary to take advantage of multi-core processors. This methodology leverages concepts from hardware chip-design methodologies, which scale to millions of communicating parallel entities. This new design process enables the software developer to create flexible system designs that easily accommodate refinement of how the computation is realized. It does this by separating the functional design of the software system from the specification of how to organize the computation. To validate this new design methodology, the research project investigates the construction of synthesis and profiling tools that can be used to develop and refine these functional and organizational specifications. These specifications are in turn used to create an executable that is optimized for the specific multi-core microprocessor.
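
As a purely illustrative sketch of this separation (the names, the dictionary format, and the use of Python are not the project's actual specification notation), the functional design below is an ordinary function that says only what to compute, while a separate, hypothetical organizational specification says how the work is mapped onto cores; retargeting the program to a different multi-core processor would mean editing only the organizational specification.

```python
# Illustrative sketch only: separating a functional design from a hypothetical
# organizational specification. None of these names come from the project itself.
from concurrent.futures import ProcessPoolExecutor

def functional_spec(x):
    """Functional design: what to compute, with no mention of cores."""
    return x * x

# Hypothetical organizational specification: how to realize the computation
# on a particular multi-core part (worker count, work granularity).
org_spec = {"workers": 4, "chunk_size": 1024}

def synthesize_and_run(data, org):
    """Toy 'synthesis' step: realizes the functional spec under a given organization."""
    with ProcessPoolExecutor(max_workers=org["workers"]) as pool:
        return list(pool.map(functional_spec, data, chunksize=org["chunk_size"]))

if __name__ == "__main__":
    print(sum(synthesize_and_run(range(10000), org_spec)))
```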

Project Report

As multicore processors enter mainstream computing, program parallelization is receiving more attention. Although many tools have been proposed to help programmers parallelize their code, one important question has been largely overlooked: "which parts of the program should I spend time parallelizing?" Profilers such as gprof answer a similar question in the domain of serial optimization: a typical profiler produces a list of regions ordered by their work coverage (we call such a list a "plan"), since a region that accounts for more work is likely to yield a larger benefit from optimization, thereby guiding the programmer toward the most effective optimization targets.

In this work, we created a prototype of Kremlin, a profiling tool for parallelization. Kremlin adopts the time-tested gprof usage model, but whereas gprof ranks regions by work coverage, Kremlin ranks them by expected program speedup. By Amdahl's law, the program speedup from parallelizing a region can be calculated if the region's work coverage and its region-localized parallelism are known, where region-localized parallelism is the parallelism available in a region excluding the parallelism originating from its subregions. The major challenge in the design of Kremlin is the lack of a technique to localize parallelism to a region: conventional critical path analysis (CPA) can determine the overall parallelism in a program, but it cannot attribute that parallelism to a specific region. Kremlin overcomes this problem with a new technique called hierarchical critical path analysis (HCPA), which extracts region-localized parallelism from the program's region hierarchy and the CPA results for each region.

In our preliminary evaluation with the NAS Parallel Benchmarks (NPB), parallelization guided by Kremlin achieves speedups comparable to manual parallelization while parallelizing fewer regions, effectively reducing the programmer's effort without sacrificing the quality of the parallelization. A user study also indicates that parallelization can be more effective with Kremlin.
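
The usage model above can be illustrated with a small, hypothetical sketch (the region names, numbers, and function names are invented for illustration and are not Kremlin's implementation): given each region's work coverage and region-localized (self) parallelism, a generalized Amdahl's-law estimate gives the whole-program speedup expected from parallelizing that region alone, and sorting regions by that estimate yields a gprof-style plan.

```python
# Hypothetical illustration of Kremlin's usage model, not its implementation.
# Each region carries its work coverage (fraction of total serial work) and its
# region-localized ("self") parallelism, as HCPA would report.
regions = {  # invented example numbers
    "loop_a": {"coverage": 0.60, "self_parallelism": 8.0},
    "loop_b": {"coverage": 0.25, "self_parallelism": 2.0},
    "init":   {"coverage": 0.05, "self_parallelism": 1.0},
}

def expected_speedup(coverage, parallelism):
    """Generalized Amdahl's law: whole-program speedup if only this region
    is parallelized by the given factor."""
    return 1.0 / ((1.0 - coverage) + coverage / parallelism)

# Rank regions by estimated whole-program speedup to form a "plan".
plan = sorted(regions.items(),
              key=lambda kv: expected_speedup(kv[1]["coverage"],
                                              kv[1]["self_parallelism"]),
              reverse=True)

for name, r in plan:
    print(f"{name}: est. speedup "
          f"{expected_speedup(r['coverage'], r['self_parallelism']):.2f}x")
```

By contrast, a gprof-style plan would order the same regions by coverage alone; the parallelism term is what lets the ranking pass over high-coverage regions that offer little exploitable parallelism.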

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 0725357
Program Officer: Sol J. Greenspan
Budget Start: 2007-10-01
Budget End: 2010-09-30
Fiscal Year: 2007
Total Cost: $462,494
Name: University of California San Diego
City: La Jolla
State: CA
Country: United States
Zip Code: 92093