
Method to increase test effectiveness of RAS functions

IP.com Disclosure Number: IPCOM000027507D
Original Publication Date: 2004-Apr-08
Included in the Prior Art Database: 2004-Apr-08
Document File: 4 page(s) / 63K

Publishing Venue

IBM

Abstract

This article describes a methodology to provide runtime error injection via the Service Processor (SP) using the Serial Communications (SCOM) facility and "JTAG on the fly" functionality rather than the host processor Memory Mapped Input Output (MMIO) and SCOM operations.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 43% of the total text.


In the current firmware test environment, testing Reliability, Availability, and Serviceability (RAS) functions is particularly difficult, both in the tools needed to provide error injection and in the ability to inject errors while a specific code path is executing. These tools require application-layer test code, device driver support, and modifications to the Run-Time Abstraction Services (RTAS) / Partition FirmWare (PFW) and hypervisor layers. The tests are also intrusive: they require compile-time modifications to RTAS/PFW and the hypervisor that make testing possible but also change the firmware, so the shipped firmware is not the tested firmware. These modifications are necessary to enable error injection for processor RAS testing, such as Level 1 cache (L1-cache), Translation Lookaside Buffer (TLB) array, Segment Lookaside Buffer (SLB) array, Data Effective to Real Address Translation (DERAT) array, memory, and I/O subsystem errors, but they must be removed before the firmware ships.

Also, because each supported operating system needs a modified device driver to give users access to the additional RTAS/PFW functions, the cost of supporting these tests is very high. Currently, in the pSeries program, the RTAS and hypervisor firmware must be built with debug flags enabled, and a device driver must be written for each of the two supported operating systems. For future programs, the partition firmware and hypervisor require debug builds for testing, and device drivers must be written for the three supported operating systems. The cost of supporting the current test methodology is very high, but this invention will reduce it by a very large factor.

Additionally, because the RTAS/PFW and hypervisor modifications are made through conditional compilation, tests must be added prior to shipment to verify that all code enabling these functions has been removed; the customer must not have access to these functions in the field.
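
The conditional-compilation approach described above can be pictured with a minimal C sketch. Everything named here is hypothetical and chosen only for illustration: the RAS_ERROR_INJECT build flag, the read_array_entry() stand-in for a hardware array read, and the verify_ship_build() check are assumptions, not the actual RTAS/PFW or hypervisor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of reading one entry from a hardware array. */
typedef struct {
    uint64_t data;
    bool parity_ok;
} array_read_t;

#ifdef RAS_ERROR_INJECT            /* hypothetical debug-build flag */
static bool inject_parity_error = false;

/* Test-only entry point: arm a one-shot parity error. */
void ras_arm_parity_error(void) { inject_parity_error = true; }
#endif

/* Stand-in for a firmware read path; 'stored' models the array contents. */
array_read_t read_array_entry(uint64_t stored)
{
    array_read_t r = { stored, true };
#ifdef RAS_ERROR_INJECT
    if (inject_parity_error) {
        r.parity_ok = false;          /* report a fake parity error */
        inject_parity_error = false;  /* one-shot: disarm after use */
    }
#endif
    return r;
}

/* Reports whether the injection hooks were compiled into this build. */
bool injection_compiled_in(void)
{
#ifdef RAS_ERROR_INJECT
    return true;
#else
    return false;
#endif
}

/* Pre-shipment check: the shipped build must not contain the hooks. */
void verify_ship_build(void)
{
    assert(!injection_compiled_in());
}
```

Built without the flag, the read path contains no injection logic and verify_ship_build() passes; built with it, the binary differs from what ships, which is the mismatch between tested and shipped firmware noted earlier.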

Currently these tests are in-band tests and have an additional limitation: there are no synchronization mechanisms that allow errors to be inserted during the execution of specific code paths. The result is lengthy test scenarios that try to hit a particular routine with an error at random, so test times are excessively long and may run to days or weeks. Additionally, test methods based on a modified device driver cannot make error testing available during initial program load (IPL) of firmware and operating systems, leaving a major hole in the ability to test errors during the early stages of machine bring-up.
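
A rough estimate shows why random, unsynchronized injection leads to such long runs. It assumes, purely for illustration, that each injection attempt independently lands in the target routine with probability p, so the number of attempts N until the first hit is geometrically distributed. The values of p and the per-attempt time below are hypothetical, not figures from this disclosure.

```latex
E[N] = \sum_{n=1}^{\infty} n\,p\,(1-p)^{n-1} = \frac{1}{p}
\qquad
p = 10^{-6},\ t_{\text{attempt}} = 1\,\mathrm{s}
\;\Rightarrow\;
E[N]\,t_{\text{attempt}} = 10^{6}\,\mathrm{s} \approx 11.6\ \text{days}
```

Under these assumptions the expected test time scales as 1/p, so a rarely executed routine pushes it into days or weeks.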

This invention removes the requirement for any modifications to RTAS/PFW and hypervisor firmware, and for special device drivers, through the use of out-of-band test methods. Additionally, the method uses a facility that guarantees the test code cannot be shipped with the system...