
Automatic Remote Disable of Processor using SMI

IP.com Disclosure Number: IPCOM000019255D
Original Publication Date: 2003-Sep-08
Included in the Prior Art Database: 2003-Sep-08
Document File: 3 page(s) / 76K

Publishing Venue

IBM

Abstract

A local service processor (SP) in a symmetric multiprocessing (SMP) server monitors environmental events and CPU predictive failure analysis (PFA) events. When an event indicates that a CPU is about to fail, the local SP informs a remote management server (or remote SP), which may authenticate the request and validate the need to take the CPU offline. If the request is valid, the remote management server/SP responds to the local SP, requesting that the appropriate system management interrupt (SMI) be invoked in BIOS. In the SMI, BIOS transfers the thread tasks running on the faulting CPU to an available CPU and brings the faulting CPU down to a stand-by or powered-off state. With BIOS handshaking to the local SP, a notice of the avoided fault may be sent back to the remote management server/SP.

This technique has several advantages:

1. A wider mapping of PFA and other system-wide information may be used to identify that a CPU is about to fail (e.g., thermal core overtemperature).
2. A policy on the remote system may be used to:
2.1 analyze the reason the CPU appears to be faulting, in order to validate the request to take that CPU offline;
2.2 allow a customer a choice in using this automatic fault-avoidance technique.
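The SMI step described in the abstract (move the tasks off the faulting CPU, then take that CPU out of service) can be illustrated with a small model. This is only a sketch under stated assumptions, not BIOS code: the names `Cpu`, `CpuState`, and `disable_cpu` are hypothetical, and choosing the least-loaded online CPU as the target is an assumption, since the disclosure says only "an available CPU".

```python
from dataclasses import dataclass, field
from enum import Enum


class CpuState(Enum):
    ONLINE = "online"
    STANDBY = "stand-by"
    POWERED_OFF = "powered-off"


@dataclass
class Cpu:
    cpu_id: int
    state: CpuState = CpuState.ONLINE
    tasks: list = field(default_factory=list)


def disable_cpu(cpus, faulting_id, target_state=CpuState.STANDBY):
    """Move tasks off the faulting CPU, then take it out of service.

    Returns the id of the CPU that received the tasks.
    """
    if target_state is CpuState.ONLINE:
        raise ValueError("target state must be stand-by or powered-off")

    faulting = next((c for c in cpus if c.cpu_id == faulting_id), None)
    if faulting is None:
        raise ValueError(f"unknown CPU id: {faulting_id}")

    # Only CPUs that are still online can take over the work.
    candidates = [c for c in cpus
                  if c.cpu_id != faulting_id and c.state is CpuState.ONLINE]
    if not candidates:
        raise RuntimeError("no available CPU to take over the tasks")

    # Assumption: pick the least-loaded online CPU as the target.
    target = min(candidates, key=lambda c: len(c.tasks))
    target.tasks.extend(faulting.tasks)
    faulting.tasks.clear()

    # Only after the tasks have moved is the faulting CPU brought down.
    faulting.state = target_state
    return target.cpu_id
```

The ordering is the point of the sketch: tasks are transferred first, and the state change happens last, so the faulting CPU is never taken down while it still owns work.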



[Figure: block diagram. Labels: Remote Management Server or Service Processor; Network; Management SP; CPU (x4)]

A flow of events may proceed as follows:


1. Local SP detects a CPU PFA or other critically suspect event.
2. Local SP alerts remote management, with some detail of the event, to request CPU shutdown authorization.
3. Remote SP or management server receives the request to shut down the CPU.
4. Is automatic CPU shutdown enabled for that server?
   No: Log the rejected CPU PFA auto-recovery reason. Local SP allows the CPU to remain active.
   Yes: Continue with step 5.
5. Is the fault type a valid reason to allow CPU shutdown?
   No: Log the rejected CPU PFA auto-recovery reason. Local SP allows the CPU to remain active.
   Yes: Continue with step 6.
6. Send the "Permit CPU shutdown" response.
7. Local SP invokes the SMI to gracefully disable the CPU.
8. Local SP reports the status of the shutdown to remote management.
9. Log the accepted CPU PFA auto-recovery reason and status.
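The two decisions made on the remote side in the flow above can be sketched as a small policy check. This is a minimal illustration under stated assumptions: `ShutdownRequest`, `evaluate_request`, `record_shutdown_status`, and the contents of `VALID_FAULT_TYPES` are hypothetical names and values chosen for the example, and the log is modelled as a plain list.

```python
from dataclasses import dataclass

# Assumption: fault types that the policy accepts as reasons for shutdown.
VALID_FAULT_TYPES = {"cpu_pfa", "core_overtemperature"}


@dataclass
class ShutdownRequest:
    server_id: str
    cpu_id: int
    fault_type: str


def evaluate_request(request, auto_shutdown_enabled, log):
    """Return True to permit the CPU shutdown, False to reject it.

    auto_shutdown_enabled maps a server id to the customer's choice.
    Rejections are logged here; acceptance is logged once the local SP
    reports the shutdown status (see record_shutdown_status).
    """
    # Decision 1: is automatic CPU shutdown enabled for that server?
    if not auto_shutdown_enabled.get(request.server_id, False):
        log.append(("rejected", request.server_id, request.cpu_id,
                    "automatic shutdown not enabled"))
        return False

    # Decision 2: is the fault type a valid reason to allow shutdown?
    if request.fault_type not in VALID_FAULT_TYPES:
        log.append(("rejected", request.server_id, request.cpu_id,
                    "fault type not a valid reason"))
        return False

    return True


def record_shutdown_status(request, status, log):
    """Log the accepted auto-recovery reason with the reported status."""
    log.append(("accepted", request.server_id, request.cpu_id,
                request.fault_type, status))
```

A server that has not opted in is rejected before the fault type is examined, which mirrors the order of the two decision boxes in the flow.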


Disclosed by International Business Machines Corporation
