Developers of networked embedded systems often find it difficult to diagnose bugs. A key observation is that in such systems, it can be beneficial to exploit domain knowledge about events in the physical world to detect failures. For example, in a sensor network deployment, knowing that the received signal strength of a radio transmission will normally decrease over distance, the application developer can enforce runtime checks to detect faulty nodes based on their relative distances to the source and the orderings of their received signal strength.

Based on this intuition, this project addresses the challenge of developing correct, resilient, and reliable networked embedded systems by (i) proposing, developing, and evaluating a methodology of using physical events to detect software bugs, (ii) developing software libraries and APIs to facilitate easy access to physical event constraints by application developers, and (iii) evaluating the effectiveness of the software libraries using real-world applications.

The completed framework could significantly reduce the debugging and maintenance costs for complicated networked embedded systems, and improve their reliability. Beyond such direct social and economic benefits, the broader impacts of this work include: (i) improving curriculum with hands-on debugging sessions; (ii) raising interest in technology among high school seniors through a Pre-Collegiate Research Scholars Program; (iii) supporting talented female and under-represented minority PhD students in completing their doctoral studies; (iv) disseminating research results through high-quality publications, high-profile tutorials, and open-source sites.

Project Report

Embedded systems are a class of computers that are highly resource constrained and tightly integrated with other components to interact with the physical world. This project addresses several challenges specific to embedded systems: developers often find it difficult to diagnose bugs in such systems and to program them for applications. A key observation is that in such systems, it can be beneficial to exploit domain knowledge about events in the physical world to detect failures. For example, in a sensor network deployment, knowing that the received signal strength of a radio transmission will normally decrease over distance, the application developer can enforce runtime checks to detect faulty nodes based on their relative distances to the source and the orderings of their received signal strength. Based on this intuition, this project addresses the challenge of developing correct, resilient, and reliable networked embedded systems.

The outcomes of this project fall into two categories: research findings and outreach activities. In the category of research findings, the main outcome is that, through a collaboration between UMN and UTK, we completed the design of a debugging protocol called FIND. FIND detects several types of faults: not only functional faults, where individual nodes may crash, but also data faults, in which a node behaves normally in all aspects except for its sensing results, which show either significantly biased or random errors.

To deal with data faults, FIND works as follows: after the nodes in a network detect a natural event, FIND ranks the nodes based on their sensor readings as well as their physical distances from the event. A node is considered faulty if there is a significant mismatch between its sensor data rank and its distance rank. At the end of the run, FIND provides a blacklist of all potentially faulty nodes.
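The rank comparison described above can be illustrated with a short sketch. This is a simplified illustration of the idea only, not the published FIND algorithm: it assumes the distance from each node to the event is already known, that readings fall off with distance, and that a fixed rank gap marks a "significant mismatch". The function name `rank_mismatch_blacklist`, the `max_rank_gap` parameter, and the sample data are all hypothetical.

```python
def rank_mismatch_blacklist(nodes, max_rank_gap=2):
    """Return the sorted IDs of nodes whose reading rank and distance rank disagree.

    nodes: dict mapping node_id -> (distance_to_event, sensor_reading).
    max_rank_gap: hypothetical threshold for a "significant" mismatch.
    """
    # Distance rank: 0 is the node closest to the event.
    by_distance = sorted(nodes, key=lambda n: nodes[n][0])
    # Reading rank: 0 is the strongest reading
    # (assumes readings normally decrease with distance).
    by_reading = sorted(nodes, key=lambda n: nodes[n][1], reverse=True)

    distance_rank = {n: i for i, n in enumerate(by_distance)}
    reading_rank = {n: i for i, n in enumerate(by_reading)}

    # Blacklist every node whose two ranks differ by more than the threshold.
    return sorted(
        n for n in nodes
        if abs(distance_rank[n] - reading_rank[n]) > max_rank_gap
    )


# Made-up example: node C is close to the event but reports a very low reading.
sample_nodes = {
    "A": (1.0, 90.0),
    "B": (2.0, 80.0),
    "C": (3.0, 10.0),
    "D": (4.0, 60.0),
    "E": (5.0, 50.0),
    "F": (6.0, 40.0),
}

blacklist = rank_mismatch_blacklist(sample_nodes)
print(blacklist)
```

In this made-up data, node C is third closest to the event but has the weakest reading, so it is the only node whose ranks differ by more than the assumed gap. A real deployment would need a principled way to choose the threshold and to handle events whose location is not known in advance.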
With the blacklist that FIND produces, further recovery processes become possible. This is the first faulty node detection method that assumes no a priori knowledge about the underlying distribution of sensed events or phenomena. This work was accepted to ACM Transactions on Sensor Networks in 2014.

In addition to the FIND protocol, the PIs also investigated other ways to improve the performance, reliability, and robustness of embedded platforms. For example, PI Cao investigated compact data structures for key-value (k-v) storage that are particularly suitable for programming embedded platforms. PI He has started a new direction studying the Safe Charging Problem (SCP): scheduling power chargers so that more energy can be received while no location in the field has electromagnetic radiation (EMR) exceeding a given threshold R. Finally, co-PI Wang has mainly focused on the detection of energy bugs in system configuration files. These studies have led to more than 10 publications in top conferences and journals.

On the outreach and education side, the materials developed by this project have been used in multiple courses: CS560 (advanced operating systems), which PI Cao developed and taught at UTK; CSCI4970W Advanced Project Laboratory, which PI He taught for undergraduate students; and a graduate-level seminar course, CSCI 8211/8910, which PI He developed and in which students are required to read, present, and conduct research on related topics. The PIs have given a few talks at universities, workshops, and conferences to disseminate the concept of robust embedded programming and its related technologies. Finally, PI He has hosted a few visiting scholars and postdocs who work on related topics. The project has also been used to generate interest among high school students through a pre-college summer program offered at UTK, where PI Cao taught a group of high school students to write correct programs on embedded platforms.
This program received very positive feedback from the students, with comments such as "fantastic" and "novel" on the course materials. The students used embedded devices as their main programming platform, and their learning materials were closely related to the project’s research.

Overall, the completed framework has significantly reduced the debugging and maintenance costs for complicated networked embedded systems, and improved their reliability. Under the funding constraints of this project (seed funding), we have done our best to accomplish the research objectives. If this research were funded at a larger scale, we would be able to systematically integrate our research methods with realistic systems for large-scale application deployments, so that we could obtain datasets from practical experience in improving the performance of networked embedded systems.

The broader impacts of this work include: (i) improving curriculum with hands-on debugging sessions; (ii) raising interest in technology among high school seniors through a pre-collegiate scholars program; (iii) supporting talented female and under-represented minority PhD students; (iv) disseminating research results through high-quality publications, high-profile tutorials, and open-source sites.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1117384
Program Officer: Theodore Baker
Budget Start: 2011-09-01
Budget End: 2013-12-31
Fiscal Year: 2011
Total Cost: $150,000
Name: University of Tennessee Knoxville
City: Knoxville
State: TN
Country: United States
Zip Code: 37996