Method for detecting condition waiting/triggering in a server farm
Original Publication Date: 2008-Jul-10
Included in the Prior Art Database: 2008-Jul-10
In a farm environment, two processes on different servers can each miss the other's condition waiting or condition triggering. Disclosed is a mechanism that resolves the problem. The solution can be applied to a single server or to multiple servers.
In a server farm (shared disk cluster architecture) there are cases where a process (the waiter) executing on one server must react to data generated by another process (the trigger) executing on any of the servers in the farm. It is common to use a database to store both the waiter and the trigger information. The problem is that the two transactions, the waiter's and the trigger's, may execute at about the same time, in which case the isolation provided by the ACID properties of database transactions hides each one from the other. Both the waiter and the trigger information are then added to the database, but the process adding the waiter data misses the trigger transaction and the process adding the trigger data misses the waiter transaction.
In a single server environment this problem can be solved by using a semaphore to synchronize the two or more processes (waiter and trigger) on the same server. However, the semaphore solution cannot be applied efficiently to different processes in a server farm environment. The semaphore synchronization could be simulated using locking in the database, but this is very inefficient and seriously degrades the performance of the processes and servers involved.
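As a rough illustration of the single-server case, the following Python sketch serializes the check-and-add steps of a waiter and a trigger with a semaphore. It is only a sketch under stated assumptions: the two processes are modeled as threads, an in-memory set stands in for the database, and the names `records`, `seen`, `gate`, `waiter` and `trigger` are hypothetical, not taken from the disclosure.

```python
import threading

# Assumption: an in-memory set stands in for the shared database.
records = set()
seen = {}                        # what each side observed during its check
gate = threading.Semaphore(1)    # binary semaphore guarding check + add


def waiter():
    # Check for a trigger record and add the waiter record as one guarded step.
    with gate:
        seen["waiter"] = "trigger" in records
        records.add("waiter")


def trigger():
    # Check for a waiter record and add the trigger record as one guarded step.
    with gate:
        seen["trigger"] = "waiter" in records
        records.add("trigger")


threads = [threading.Thread(target=f) for f in (waiter, trigger)]
for th in threads:
    th.start()
for th in threads:
    th.join()

# Whichever side entered the guarded step second saw the other's record.
detected = seen["waiter"] or seen["trigger"]
```

Because the semaphore admits one thread at a time, the check and the add of one side can never interleave with those of the other, so exactly one side observes the other's record. The cost is that every waiter and trigger is serialized through the same gate.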
The solution is to have both the waiter and the trigger processes perform an additional transaction that repeats the check whenever the first transaction did not detect the condition setting/triggering. The process whose first transaction commits last begins its second transaction after both records are in the database, so in the second try at least one of the two processes is guaranteed to detect the setting/triggering recorded by the other process's first try. Thus, the triggering will never be missed.
The solution eliminates the need to synchronize multiple processes (either on the same server or on different servers) and thus avoids the associated performance impact. Because the processes involved are never serialized, the solution provides better performance characteristics than traditional locking approaches.
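The two-transaction approach described above can be sketched in Python as follows. This is a minimal simulation under stated assumptions, not the disclosed implementation: `ToyDB` and `Txn` are hypothetical classes that mimic transaction isolation by giving each transaction a snapshot of the rows committed before it began, and `overlapping_run` hard-codes one interleaving in which the two first transactions overlap.

```python
class Txn:
    """Hypothetical transaction: reads a snapshot taken when it began."""

    def __init__(self, db):
        self.db = db
        self.snapshot = list(db.committed)  # rows visible to this transaction
        self.writes = []                    # rows invisible to others until commit

    def has(self, row):
        return row in self.snapshot

    def add(self, row):
        self.writes.append(row)

    def commit(self):
        self.db.committed.extend(self.writes)


class ToyDB:
    """Hypothetical stand-in for the shared database."""

    def __init__(self):
        self.committed = []

    def begin(self):
        return Txn(self)


def overlapping_run(recheck):
    """Run a waiter and a trigger whose first transactions overlap.

    Returns (waiter_saw_trigger, trigger_saw_waiter).
    """
    db = ToyDB()
    w1, t1 = db.begin(), db.begin()    # both begin before either commits

    waiter_saw = w1.has("trigger")     # waiter checks for a trigger record
    w1.add("waiter")                   # waiter records that it is waiting
    trigger_saw = t1.has("waiter")     # trigger checks for a waiter record
    t1.add("trigger")                  # trigger records the condition

    w1.commit()                        # waiter's first transaction ends
    if recheck and not waiter_saw:
        # Second try by the waiter; the trigger has not committed yet here.
        waiter_saw = db.begin().has("trigger")

    t1.commit()                        # trigger's first transaction ends
    if recheck and not trigger_saw:
        # Second try by the trigger; it begins after both first commits.
        trigger_saw = db.begin().has("waiter")

    return waiter_saw, trigger_saw


without_recheck = overlapping_run(recheck=False)
with_recheck = overlapping_run(recheck=True)
```

In this interleaving `without_recheck` is `(False, False)`, the missed case, while `with_recheck` is `(False, True)`: the waiter's second try still runs too early, but the trigger, which committed last, finds the waiter record. No lock or semaphore is involved at any point.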
The following scenarios show a case without the problem, a case with the problem, and a solution to the problem. All the scenarios involve two or more processes residing on different servers:
(1) A "waiter" process checks whether the condition has been triggered. If the condition has been triggered, this process will perform some operation with the trigger and the waiter data.
(2) A "trigger" process triggers the condition and checks if there is a waiter waiting for the condition. If it finds a waiter, it will perform some operation with the waiter and the trigger data.
The following abbreviations are used in the figures:
BT - Begin Transaction
ET - End Transaction
CT - Check whether a condition has been triggered
AW - Add a record to database to indicate waiting for the condition
CW - Check whether there is anybody waiting for a triggered condition
AT - Add a record t...