A method of recovering from PCI errors using SMI

IP.com Disclosure Number: IPCOM000126968D
Original Publication Date: 2005-Aug-16
Included in the Prior Art Database: 2005-Aug-16
Document File: 1 page(s) / 21K

Publishing Venue

IBM

Abstract

The SMI# handler turns off power to a failing slot when it is invoked due to PCI errors from that slot. Since SMI# has higher priority than NMI, the OS will see the PCI error as an unsafe hot removal, which may be recoverable by the OS device driver.

PCI errors are typically reported using SERR#, which causes an SMI# for logging the error and an NMI that crashes the system due to hardware malfunction. However, the OS may be able to recover from some of these errors (for example, in the case of unsafe hot removal of a device).
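
The conventional flow described above can be modeled in a few lines. This is a toy simulation rather than firmware: the class `Platform`, its fields, and the function `conventional_serr` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    """Toy model of platform state (all names are hypothetical)."""
    error_log: list = field(default_factory=list)
    nmi_raised: bool = False
    os_running: bool = True


def conventional_serr(platform: Platform, slot: int) -> None:
    """Model the conventional handling of SERR# from a PCI slot."""
    # SMI# path: platform firmware logs the error.
    platform.error_log.append(f"PCI error on slot {slot}")
    # NMI path: the OS treats the error as a hardware malfunction and crashes.
    platform.nmi_raised = True
    platform.os_running = False


p = Platform()
conventional_serr(p, slot=3)
print(p.error_log, p.os_running)
```

In this model the OS never gets a chance to recover, because the NMI path always ends in a crash.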

Because there is no architected way to communicate between the SMI# handler provided by the platform and the NMI handler provided by the OS, the OS has no way to attempt a recovery from PCI errors. By contrast, when a device is removed unsafely from a PCI slot, the SHPC (Standard Hot-Plug Controller) automatically powers off the slot, and the OS merely reports an unsafe removal with the help of the SHPC driver or ACPI AML code. A power fault does not crash the system. Therefore, upon a PCI error the SMI# handler can power off the PCI slot instead of causing an NMI, and the OS then sees the event as an unsafe removal. The PCI interface is disabled upon a PCI error, just as it is when the slot is powered off. The exposure to data corruption is the same as for a power fault caused by unsafe removal, which the device driver is able to detect and then decide whether to crash or not.
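
The proposed handling can be sketched as a small simulation. It is a minimal model under stated assumptions, not real SMI# firmware or a real driver: `Slot`, `System`, `smi_handler_on_pci_error`, and `driver_should_crash` are hypothetical names, and the driver policy (crash only if writes were in flight) is an assumed example of the decision the disclosure leaves to the device driver.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Toy model of one PCI slot (hypothetical)."""
    powered: bool = True
    interface_enabled: bool = True


@dataclass
class System:
    """Toy model of platform plus OS-visible state (hypothetical)."""
    slots: dict
    error_log: list = field(default_factory=list)
    nmi_raised: bool = False
    os_events: list = field(default_factory=list)


def smi_handler_on_pci_error(system: System, slot_id: int) -> None:
    """Model the proposed SMI# handler: power off the slot, raise no NMI."""
    system.error_log.append(f"PCI error on slot {slot_id}")
    slot = system.slots[slot_id]
    # The PCI interface is disabled on the error and on power-off alike.
    slot.interface_enabled = False
    slot.powered = False
    # nmi_raised is deliberately left False; the OS sees an unsafe removal.
    system.os_events.append(("unsafe_removal", slot_id))


def driver_should_crash(writes_in_flight: int) -> bool:
    """Assumed driver policy: crash only if data may have been lost."""
    return writes_in_flight > 0


system = System(slots={0: Slot(), 1: Slot()})
smi_handler_on_pci_error(system, slot_id=1)
for event, slot_id in system.os_events:
    action = "crash" if driver_should_crash(writes_in_flight=0) else "recover"
    print(event, slot_id, action)
```

Only the failing slot is affected in this model; the other slot stays powered, and the crash-or-recover decision moves from the NMI handler to the device driver.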
