A method to keep good response time of on-line system on multi system environment
Original Publication Date: 2000-May-01
Included in the Prior Art Database: 2003-Jun-18
Disclosed is a method that reduces the time a mission-critical on-line system in a multi-system environment must wait for recovery of a complex file system. When a read or write operation is requested but the disk does not respond within the expected period, the condition is called an I/O timeout. A complex file system may have one or more mirrored or spare files prepared against disk system failure. Such a file system usually obtains exclusive control of a file before accessing it, in order to preserve data integrity. When an error occurs on such a file system, a mirrored file takes over from the failed file and immediately becomes the main file. However, if the error is not reported to the system that requested the I/O, that system has to wait for the defective file to be recovered. A mission-critical on-line system cannot wait for the recovery of a failed file. If the on-line system detects an I/O timeout against a complex file system and the timeout is caused by an error, the on-line system can detach the defective file and continue processing. Usually, however, the on-line system cannot determine whether the timeout is caused by a physical error or by something else, for example another system reading or writing the file while holding exclusive control. With the disclosed method, the on-line system can determine the reason for the timeout correctly, and degradation of transaction response time is minimized.
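The decision described above can be illustrated with a minimal Python sketch. The text here does not state how the reason for a timeout is determined, so the sketch assumes, purely for illustration, that the on-line system can learn which system currently holds exclusive control of the file. All names (MirroredFile, classify_timeout, handle_io_timeout, the system and disk identifiers) are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TimeoutCause(Enum):
    LOCK_HELD_BY_PEER = "lock_held_by_peer"            # another system has exclusive control
    SUSPECTED_DEVICE_ERROR = "suspected_device_error"  # no peer explains the delay


@dataclass
class MirroredFile:
    """Hypothetical model of a complex file: one main copy plus mirrors."""
    main: str
    mirrors: List[str] = field(default_factory=list)
    detached: List[str] = field(default_factory=list)

    def detach_main(self) -> str:
        """Detach the failed main copy and promote the first mirror to main."""
        if not self.mirrors:
            raise RuntimeError("no mirror available to take over")
        self.detached.append(self.main)
        self.main = self.mirrors.pop(0)
        return self.main


def classify_timeout(lock_holder: Optional[str], local_system: str) -> TimeoutCause:
    """Classify an I/O timeout.

    Assumption: lock_holder is the name of the system that currently holds
    exclusive control of the file, or None if no system holds it.
    """
    if lock_holder is not None and lock_holder != local_system:
        return TimeoutCause.LOCK_HELD_BY_PEER
    return TimeoutCause.SUSPECTED_DEVICE_ERROR


def handle_io_timeout(file: MirroredFile,
                      lock_holder: Optional[str],
                      local_system: str) -> str:
    """Return the action taken after an I/O timeout: 'retry' or 'detached'."""
    cause = classify_timeout(lock_holder, local_system)
    if cause is TimeoutCause.LOCK_HELD_BY_PEER:
        # The file is healthy but busy; detaching it would be wrong.
        return "retry"
    # No peer explains the delay: treat it as a failure and continue on the mirror.
    file.detach_main()
    return "detached"
```

In this sketch, a timeout that occurs while another system holds exclusive control leads to a retry and leaves the file untouched, whereas a timeout with no such holder detaches the main copy so that processing continues on the mirror without waiting for recovery.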