Chip multiprocessor (CMP) systems, also known as multicore systems, provide multiple processors on a single chip and have displaced single-processor architectures as the de facto standard for computing platforms. This change occurred because CMPs offer superior performance and power efficiency compared to traditional designs. An emerging feature of the CMP era is the deployment of several different types of processing elements on the same platform, with varying computation speed and power consumption characteristics. An additional complicating factor is the trend toward overprovisioned designs, in which only a subset of the available cores can be active at any time because of power and thermal constraints. Such heterogeneous CMPs are increasingly being deployed in systems where applications with different safety assurance (dependability) and timeliness requirements must co-exist on the same CMP. Hence, there is a growing need for an integrated framework that allocates the heterogeneous hardware resources of a CMP among applications in a way that uses those resources efficiently while assuring that the applications' diverse safety and timeliness requirements are met.

This project aims to develop models, algorithms, and run-time management schemes for collections of applications with a mix of different timing and dependability requirements running on a shared heterogeneous CMP platform. In particular, a central objective is to develop a sound methodology for selectively applying known hardware and software fault tolerance mechanisms (such as modular redundancy, task replication, and re-execution) to such mixed-dependability applications while considering resource, power, and timing constraints simultaneously. A second objective is to extend the framework to handle intermittent run-time faults that occur in bursts and can affect multiple applications at once during a bounded time window. Success in these efforts could improve the safety and reduce the development and production costs of the increasingly complex cyber-physical systems upon which we all have come to depend.

Education and outreach activities include integration of aspects of the research into undergraduate and graduate courses at the two participating institutions, involvement of students as research assistants, and efforts to recruit student participants from under-represented demographic groups.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Type
Standard Grant (Standard)
Application #
1421855
Program Officer
Marilyn McClure
Project Start
Project End
Budget Start
2014-08-15
Budget End
2018-07-31
Support Year
Fiscal Year
2014
Total Cost
$269,966
Indirect Cost
Name
George Mason University
Department
Type
DUNS #
City
Fairfax
State
VA
Country
United States
Zip Code
22030