Improving Dependability using Shared Supplementary Memory and Opportunistic Micro Rejuvenation in Multi-tasking Embedded Systems

IP.com Disclosure Number: IPCOM000160373D
Original Publication Date: 2007-Nov-16
Included in the Prior Art Database: 2007-Nov-16
Document File: 8 page(s) / 401K

Publishing Venue

Motorola

Related People

Vinaitheerthan Sundaram: INVENTOR [+5]

Abstract

We propose a comprehensive solution to memory-overflow problems in multitasking embedded systems, thereby improving their reliability and availability. In particular, we propose two complementary techniques that address two significant causes of memory overflow. The first cause is error in estimating stack and heap memory requirements. Our first technique, called Shared Supplementary Memory (SSM), exploits the fact that the probability of multiple tasks concurrently requiring more than their estimated amount of memory is low. Using an analytical model and simulations, we show that reliability can be considerably improved when SSM is employed. Furthermore, for the same reliability, SSM reduces the total memory requirement by as much as 29.31%. The second cause is the presence of coding Mandelbugs, which can cause abnormal memory requirements. To address this, we propose a novel technique called Opportunistic Micro-Rejuvenation, which, when combined with SSM, provides several advantages: prevention of critical-time outages, resource frugality, and dependability enhancement.
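
The intuition behind SSM, that several tasks rarely exceed their memory estimates at the same time, can be illustrated with a small calculation. The sketch below is not the analytical model referred to in the abstract, which is not reproduced in this abbreviated text. It assumes, purely for illustration, that each of n tasks independently exceeds its estimate with the same probability p at a given instant, and that the shared pool is sized to absorb up to k simultaneous overflows. The function name and all parameter values are hypothetical.

```python
from math import comb


def prob_more_than_k_overflow(n_tasks: int, p_overflow: float, k: int) -> float:
    """Probability that more than k of n_tasks exceed their estimate at once.

    Illustrative assumption (not from the source): tasks overflow
    independently, each with the same probability p_overflow, so the number
    of simultaneously overflowing tasks follows a binomial distribution.
    """
    # Sum the binomial probabilities of 0..k simultaneous overflows.
    at_most_k = sum(
        comb(n_tasks, i) * p_overflow**i * (1 - p_overflow) ** (n_tasks - i)
        for i in range(k + 1)
    )
    return 1.0 - at_most_k


if __name__ == "__main__":
    n, p = 10, 0.05  # hypothetical task count and per-task overflow probability
    for k in range(4):
        # k = 0 corresponds to no shared pool: any single overflow is fatal.
        print(f"pool covers {k} overflow(s): "
              f"P(pool exhausted) = {prob_more_than_k_overflow(n, p, k):.6f}")
```

Under these assumptions the probability of exhausting the pool drops quickly as k grows, which is why a shared pool much smaller than the sum of per-task worst-case margins may suffice. Real tasks need not overflow independently, so the numbers are indicative only.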

This is the abbreviated version, containing approximately 8% of the total text.


Improving Dependability using Shared Supplementary Memory and Opportunistic Micro Rejuvenation in Multi-tasking Embedded Systems

Vinaitheerthan Sundaram, Sandip HomChaudhuri, Sachin Garg, Chandra Kintala, Saurabh Bagchi

 


Keywords: stack overflow, heap overflow, embedded systems, software rejuvenation, resource constrained fault-tolerance

1. Introduction

Embedded systems are becoming increasingly ubiquitous [2], and in recent years their complexity has grown exponentially. Specifically, the growth in requirements for real-time, networked, and multi-tasking applications has led to an increased presence of Mandelbugs/Heisenbugs [7] in the code. These are rare bugs that are typically triggered by a change in the program's run-time environment, by (invalid) input data, or by ageing of the software, and they are thus not deterministic in nature. The same growth has also made the task of accurately estimating run-time memory requirements (stack and heap) significantly harder. Inaccurate estimation of memory requirements and Mandelbugs cause memory overflow or out-of-memory conditions, a grave problem in embedded systems. Such conditions lead to unexpected system crashes [8], since most embedded systems, as many as 95% according to Middha et al. [4], do not use virtual memory. The effect of a system crash can be a slight inconvenience (e.g., crash of a set-top box), loss of revenue (e.g., a dropped call on a mobile phone), or even loss of life (e.g., crash of an aircraft controller). This has increased the need to design these systems with fault-tolerance mechanisms that enhance reliability, not just from a safety perspective but also from a user-experience standpoint.

In this paper, we propose a comprehensive solution that addresses both important causes of memory overflow, na...