Automatic Remote Disable of Processor using SMI
Disclosure Number: IPCOM000019255D
Original Publication Date: 2003-Sep-08
Included in the Prior Art Database: 2003-Sep-08
A local service processor (SP) in a symmetric multiprocessing (SMP) server monitors environmental and CPU predictive failure analysis (PFA) events. When an event indicates that a CPU is going to fail, the local SP informs a remote management server (or SP), which may authenticate and validate the need to take the CPU offline. If the need is validated, the remote management server/SP responds to the local SP, requesting that an appropriate system management interrupt (SMI) be invoked in BIOS. In the SMI handler, BIOS transfers the thread tasks running on the faulting CPU to an available CPU and brings the faulting CPU down to a stand-by or powered-off state. With BIOS handshaking to the local SP, a notice of the avoided fault may be sent back to the remote management server/SP.

This technique has several advantages:

1. A wider mapping of PFA and other system-wide information may be used to identify that a CPU is about to fail (e.g., thermal core overtemperature).
2. A policy on the remote system may be used to
   2.1 analyze the reason the CPU appears to be faulting, to validate the request to take that CPU offline
   2.2 allow a customer a choice in using this automatic fault-avoidance technique
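The graceful-disable step performed inside the SMI can be illustrated with a small simulation. This is only a sketch under stated assumptions: real BIOS/SMI code would be firmware, not Python, and the names `Cpu`, `CpuState`, and `disable_cpu`, as well as the "least-loaded CPU takes over" rule, are hypothetical and not part of the disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CpuState(Enum):
    ONLINE = "online"
    STANDBY = "standby"  # the disclosure also allows a powered-off state


@dataclass
class Cpu:
    cpu_id: int
    state: CpuState = CpuState.ONLINE
    tasks: List[str] = field(default_factory=list)


def disable_cpu(cpus: List[Cpu], faulting_id: int) -> int:
    """Move tasks off the faulting CPU, then place it in stand-by.

    Returns the id of the CPU that received the tasks.
    """
    faulting = next((c for c in cpus if c.cpu_id == faulting_id), None)
    if faulting is None:
        raise ValueError(f"unknown CPU id {faulting_id}")

    candidates = [
        c for c in cpus
        if c.state is CpuState.ONLINE and c.cpu_id != faulting_id
    ]
    if not candidates:
        raise RuntimeError("no available CPU to take over the tasks")

    # Assumption: the least-loaded online CPU takes over the work.
    target = min(candidates, key=lambda c: len(c.tasks))
    target.tasks.extend(faulting.tasks)
    faulting.tasks.clear()
    faulting.state = CpuState.STANDBY
    return target.cpu_id
```

The function refuses to disable the last online CPU, which reflects the requirement that an available CPU must exist before the faulting one is brought down.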


[Figure: block diagram with two labeled components, "Remote Management Server or Service Processor" and "Management SP"]


A flow of events may proceed as follows:

1. Local SP detects a CPU PFA or other critically suspect event.
2. Local SP alerts remote management with some detail of the event to request CPU shutdown authorization.
3. Remote SP or management server receives the request to shut down the CPU.
4. Remote SP or management server checks: Is automatic CPU shutdown enabled for that server? Is the fault type a valid reason to allow CPU shutdown?
5. If either answer is no:
   5.1 Log the rejected CPU PFA auto-recovery reason.
   5.2 Local SP allows the CPU system to remain active.
6. If both answers are yes:
   6.1 Send the "Permit CPU shutdown" response.
   6.2 Local SP invokes the SMI to gracefully disable the CPU.
   6.3 Local SP reports the status of the shutdown to remote management.
   6.4 Log the accepted CPU PFA auto-recovery reason and status.
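The decision flow above can be sketched in a few lines of Python. This is an illustrative model, not the disclosed implementation: `ShutdownRequest`, `RemotePolicy`, `handle_pfa_event`, the status strings, and the `invoke_smi` callback are all hypothetical names, and the network transport and authentication between the local SP and remote management are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ShutdownRequest:
    server_id: str
    cpu_id: int
    fault_type: str  # e.g. "thermal"; the set of fault types is an assumption


@dataclass
class RemotePolicy:
    enabled_servers: Set[str]    # servers where the customer opted in
    valid_fault_types: Set[str]  # fault types accepted as shutdown reasons
    log: List[str] = field(default_factory=list)

    def authorize(self, req: ShutdownRequest) -> bool:
        """Apply the two policy checks; log the reason on rejection."""
        if req.server_id not in self.enabled_servers:
            self.log.append(
                f"rejected: auto shutdown disabled for {req.server_id}")
            return False
        if req.fault_type not in self.valid_fault_types:
            self.log.append(
                f"rejected: fault type '{req.fault_type}' not a valid reason")
            return False
        return True

    def record_status(self, req: ShutdownRequest, status: str) -> None:
        self.log.append(
            f"accepted: cpu {req.cpu_id} on {req.server_id}, "
            f"reason '{req.fault_type}', status {status}")


def handle_pfa_event(policy: RemotePolicy, req: ShutdownRequest,
                     invoke_smi: Callable[[int], str]) -> str:
    """Local-SP view: ask for authorization, then act on the answer."""
    if not policy.authorize(req):
        return "cpu-remains-active"
    status = invoke_smi(req.cpu_id)    # stand-in for the BIOS SMI call
    policy.record_status(req, status)  # report back to remote management
    return status
```

Passing `invoke_smi` as a callback keeps the policy logic testable without firmware; a caller might supply a stub that returns a status string such as `"cpu-disabled"`.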


