Recovery From VPD Corruption
Publication Date: 2014-Aug-26
The IP.com Prior Art Database
Abstract: This disclosure proposes ways to mitigate the effects of VPD corruption, on the premise that VPD corruption need not indicate a problem with the actual hardware circuitry. The core idea of this disclosure is to automatically fetch the VPD required to initialize the hardware from other sources when VPD corruption occurs. The other sources include identical hardware in the same system, identical hardware in the datacenter, or the manufacturer's datastore.
Vital Product Data (VPD) is a collection of configuration and informational data associated with a particular set of hardware or software. It stores information such as part numbers, serial numbers, and engineering change levels. Not all devices attached to a system provide VPD, but it is often available from major components such as processors, DIMMs, and IO cards. VPD is typically burned onto EEPROMs associated with the various hardware components and can be queried over the attached I2C buses. VPD can also store hardware-specific initialization and characterization information; for example, the VPD of a processor includes its voltages, frequency, and number of cores. Firmware uses this information to determine the nature of the system hardware and to initialize it. Consequently, VPD corruption renders the hardware uninitializable (and thus unusable) even though there may be no real problem with the actual hardware circuits. The impact of VPD corruption is much higher if the associated hardware is involved in host OS bootstrapping (e.g., the primary processor or DIMMs).
This disclosure proposes ways to mitigate the effects of VPD corruption, on the premise that VPD corruption need not indicate a problem with the actual hardware circuitry.
Known solutions to this problem include using ECC in the VPD or having redundant VPD chips. Neither of these helped in a real field problem (PMR 31458,644,644). The problem with these approaches is that they rely on the firmware's ability to read the VPD from the EEPROM chips. However, electrical interference or faulty buses can prevent the firmware from reading the VPD. In that field problem, a faulty processor was being replaced with processors that IBM had shipped as replacements. The replacements were thoroughly tested before shipping. Yet the firmware was unable to read the processor VPD and therefore declared a failure. Note that the firmware did not run any actual diagnostics on the processor; the failure was declared solely because the VPD could not be read. This makes sense from the firmware's perspective, because without the VPD it does not know how to initialize and diagnose the chip.
Thus an additional solution is required.
References: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN/7568123: This invention describes storing a backup of the VPD on a management system.
The core idea of this disclosure is to automatically fetch the VPD required to initialize the hardware from other sources when VPD corruption occurs. The other sources include identical hardware in the same system, identical hardware in the datacenter, or the manufacturer's datastore.
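The fallback idea above can be illustrated with a minimal Python sketch. It is not the firmware implementation described in this disclosure. The names VpdRecord, VpdSource, and recover_vpd are hypothetical, as is the use of a CRC-32 check to detect corruption and the choice to treat a bus read failure as an OSError. The sketch assumes each alternate source can be queried by part number, and that "identical hardware" means a matching part number.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class VpdRecord:
    """Hypothetical VPD record: initialization payload plus a checksum."""
    part_number: str
    payload: bytes
    crc32: int

    def is_valid(self) -> bool:
        # Assumption: corruption is detected by comparing a stored CRC-32
        # against one recomputed over the payload.
        return zlib.crc32(self.payload) == self.crc32


# A source takes a part number and returns a record, or None if it has none.
VpdSource = Callable[[str], Optional[VpdRecord]]


def recover_vpd(
    part_number: str,
    sources: Sequence[Tuple[str, VpdSource]],
) -> Optional[Tuple[str, VpdRecord]]:
    """Try each source in order; return (source name, record) for the first
    source that yields an uncorrupted record for the same part number."""
    for name, fetch in sources:
        try:
            record = fetch(part_number)
        except OSError:
            # Assumption: an unreadable bus or unreachable store raises OSError.
            continue
        if (
            record is not None
            and record.part_number == part_number
            and record.is_valid()
        ):
            return name, record
    return None
```

A caller would pass the sources in the order given above: the component's own EEPROM first, then identical hardware in the same system, then identical hardware in the datacenter, and finally the manufacturer's datastore. A return value of None means that no source produced usable VPD.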
On POWER servers, the Service Processor (SP) runs the firmware stack to initialize the hardware and according to this disclosure, if a SP detects an uncorrectable VPD corruptio...