Methods and Apparatus for Monitoring and Restarting Software Processes

IP.com Disclosure Number: IPCOM000199059D
Original Publication Date: 2010-Sep-09
Included in the Prior Art Database: 2010-Sep-09
Document File: 3 page(s) / 253K

Publishing Venue

Siemens

Related People

Juergen Carstens: CONTACT

Abstract

Software processes and groups of processes often tend to be unstable, depending on the circumstances in the running environment. Instability can be caused by, for instance:
• Unexpected disturbance of network connections that the processes do not handle correctly.
• Low memory or high CPU load.
• A process crashes while another process depends on it.
• Unexpected disconnection of a process from a database to which it is connected.
• Invalid user input is passed to a process that does not handle it correctly and thus enters an unstable state or even crashes.
Up to now, the following solutions have been used:
• Proper error handling is added to the processes. However, if a system consists of a very large number of individual processes, it is often not feasible to retrofit all existing processes with proper error handling because of cost and effort.

This is the abbreviated version, containing approximately 52% of the total text.

Idea: Bernd Steiner, DE-Nürnberg

Software processes and groups of processes often tend to be unstable, depending on the circumstances in the running environment. Instability can be caused by, for instance:
• Unexpected disturbance of network connections that the processes do not handle correctly.
• Low memory or high CPU load.
• A process crashes while another process depends on it.
• Unexpected disconnection of a process from a database to which it is connected.
• Invalid user input is passed to a process that does not handle it correctly and thus enters an unstable state or even crashes.

Up to now, the following solutions have been used:
• Proper error handling is added to the processes. However, if a system consists of a very large number of individual processes, it is often not feasible to retrofit all existing processes with proper error handling because of cost and effort.
• Processes are made more stable by using more stable technologies or a less error-prone architecture. The same problem applies here: if too many processes need to be adjusted, cost and effort are high.
• In-depth code reviews of processes are performed during development. However, code reviews do not completely prevent unstable or unexpected program behavior.
• In-depth system tests are performed in order to cover as many test cases as possible. However, the most serious problems usually occur on customer systems, not in the development test labs.
• If processes crash, they may be restarted manually. This is not a valid solution for serious products sold to customers.
• If processes behave unstably, they may be terminated and restarted manually. However, the in-memory data of such processes is lost, which may lead to data inconsistency. This is also not a valid solution for serious products sold to customers.

Therefore, a novel solution for monitoring and restarting software processes is proposed and shown in Figure 1. The sequence diagram of the proposed solution is shown in Figure 2.
• A so-called Process Restart Manager (PRM) runs on each machine or in every product where processes tend to show instability or unexpected behavior.
• The PRM can be implemented as a single process or as a service that runs in the background and is always available.
• The software processes may subscribe to monitoring at the Process Restart Manager.
• Once subscription is complete, the PRM monitors every process that has subscribed. The subscribed processes continue to perform their actual tasks (see Figure 1).
• The PRM continuously monitors the key process properties and checks for memory faults, critical Input/Output errors and other problems or violations in...
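
The subscribe-and-monitor idea described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The class name ProcessRestartManager and the methods subscribe, check_once and shutdown are hypothetical names chosen for illustration. The sketch also makes two simplifying assumptions: the manager launches the subscribed process itself (in the disclosure, running processes subscribe to the PRM), and the only property checked is whether the process has exited, so memory faults and Input/Output errors are not detected.

```python
import subprocess
import sys
import time


class ProcessRestartManager:
    """Hypothetical PRM sketch: tracks subscribed processes, restarts exited ones."""

    def __init__(self):
        self._commands = {}  # name -> command line used to (re)start the process
        self._procs = {}     # name -> subprocess.Popen handle of the current instance
        self.restarts = {}   # name -> number of restarts performed so far

    def subscribe(self, name, command):
        """Register a process for monitoring and start it (assumption: PRM launches it)."""
        self._commands[name] = command
        self._procs[name] = subprocess.Popen(command)
        self.restarts[name] = 0

    def check_once(self):
        """Poll every subscribed process once and restart any that has exited."""
        restarted = []
        for name, proc in self._procs.items():
            if proc.poll() is not None:  # a return code is set, so the process exited
                self._procs[name] = subprocess.Popen(self._commands[name])
                self.restarts[name] += 1
                restarted.append(name)
        return restarted

    def shutdown(self):
        """Terminate all still-running processes and reap them."""
        for proc in self._procs.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()


if __name__ == "__main__":
    prm = ProcessRestartManager()
    # Hypothetical subscriber that crashes immediately with exit code 1.
    prm.subscribe("crasher", [sys.executable, "-c", "raise SystemExit(1)"])
    time.sleep(1.0)  # give the child time to exit before polling
    print("restarted:", prm.check_once())
    prm.shutdown()
```

In a background service, check_once would be called periodically from a loop. A simple restart like this one loses the in-memory data of the crashed process, which is the data-inconsistency risk noted earlier for manual restarts.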