CCR-0209094 "Hybrid Modeling and Analysis of Error Recovery in Safety Critical Flight Control Systems"

Embedded computer systems have become an essential component of technological products and systems. One example is the set of safety-critical real-time computer systems on board the Boeing 777, a digital fly-by-wire aircraft. Fully certifying that safety-critical systems will operate as intended requires the validation and verification of both the software and the hardware. An additional challenge is the use of safety-critical systems in harsh environments that produce electromagnetic interference (EMI), such as high intensity radiated fields (HIRF) or lightning. Under these harsh conditions, triple modular redundancy, error correcting codes, and other fault-tolerant computing techniques are known to be of limited use, since multiple fault containment regions are affected near-simultaneously by correlated or common-mode faults.

The project is developing enhanced models and analysis tools from the ground up: it starts with models of the physical system, the controller, and the environment in order to study the stability of closed-loop systems and the safety properties of embedded software. To ensure that the theoretical foundations being developed are sound, a particular class of systems is considered: computer systems with error recovery that control physical processes and mitigate the effects of common-mode faults. Transitions in these systems are driven by external and internal events. External events are triggered with a certain probability by the presence of a harsh electromagnetic disturbance; internal events are generated by the error recovery logic. This class of systems is hybrid, since it includes the continuous-time dynamics of the process under control and of the electromagnetic environment, the discrete-time dynamics of the controller, and the models for the transitions. The models and tools being developed are enhancements of switched system models and analysis tools. Their capabilities are validated, both together and independently, on a particular flight control system.

The controller is being implemented using rollback recovery, an architecture that has been evolving for the past 30 years. This architecture has been widely used in digital process control systems and in real-time database transaction systems. In particular, a rollback error recovery architecture using dual lock-step processors is part of a prototype of a recoverable computer system (RCS) being investigated by a NASA-industry partnership to deal with transient or soft common-mode faults. The new models will be validated using data from NASA Langley Research Center's HIRF Laboratory via a Cooperative Research Agreement. The analytical tools developed in this project will allow system designers to quickly evaluate new recoverable computer architectures before carrying out the more expensive and time-consuming physical tests.
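
As a rough illustration of the class of systems described above, the following Python sketch simulates a scalar closed-loop system that switches between a nominal mode and a recovery mode. It is not the project's model. Every name and number in it is a hypothetical choice made for illustration: the function `simulate`, the plant parameters `a` and `b`, the gain `k`, the per-sample upset probability `p_upset`, and the fixed recovery length `recovery_len`. It also assumes that upsets arrive as independent Bernoulli trials and that the control input is zero while the controller recovers.

```python
import random


def simulate(a=1.02, b=1.0, k=0.5, p_upset=0.05, recovery_len=3,
             steps=200, x0=1.0, seed=0):
    """Simulate x[t+1] = a*x[t] + b*u[t] under random controller upsets.

    Nominal mode:  u = -k*x (state feedback).
    Recovery mode: u = 0    (assumption: no control output during rollback).
    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    x = x0
    remaining = 0            # samples left in the current recovery
    trajectory = [x]
    modes = []
    for _ in range(steps):
        # External event: a disturbance upsets the controller with
        # probability p_upset at each sample while it is running nominally.
        if remaining == 0 and rng.random() < p_upset:
            remaining = recovery_len
        if remaining > 0:
            u = 0.0
            remaining -= 1   # internal event: recovery ends when this hits 0
            modes.append("recovery")
        else:
            u = -k * x
            modes.append("nominal")
        x = a * x + b * u
        trajectory.append(x)
    return trajectory, modes


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.5):
        traj, modes = simulate(p_upset=p)
        frac = modes.count("recovery") / len(modes)
        print(f"p_upset={p}: recovery fraction={frac:.2f}, |x_final|={abs(traj[-1]):.3e}")
```

With these illustrative values the open-loop plant is slightly unstable and the nominal closed loop is stable, so the final state depends on how much time the controller spends recovering. That trade-off is the kind of question a stability analysis of the switched model would aim to answer without simulation.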

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 0209094
Program Officer: D. Helen Gill
Project Start:
Project End:
Budget Start: 2002-08-15
Budget End: 2004-07-31
Support Year:
Fiscal Year: 2002
Total Cost: $119,996
Indirect Cost:
Name: Old Dominion University Research Foundation
Department:
Type:
DUNS #:
City: Norfolk
State: VA
Country: United States
Zip Code: 23508