Workload Aware Exception Detection
Original Publication Date: 2010-Jan-12
Included in the Prior Art Database: 2010-Jan-12
The administration of databases in today's business environments is becoming increasingly complex. A wide range of monitoring products exists to help administrators detect specific problem situations and react as soon as possible. Thresholds are a common mechanism to model such situations, but in general they are valid for only one specific type of workload. This paper describes a workload detection mechanism that extends a common database monitoring architecture so that it can switch between threshold sets when a workload change occurs.
Disclosed is a procedure to detect database-related exceptions, i.e. specified problem situations, depending on the current workload. Each database management system (DBMS) externalizes information about its current performance through some kind of instrumentation facility. The instrumentation facility exposes the internal state of the DBMS by providing access to numerous performance metrics (system hit ratio, elapsed time for statement execution, etc.) and event counts (e.g. deadlocks detected, authorization failures occurred, etc.).
State-of-the-art database monitoring tools provide means to alert database administrators (DBAs) about exceptional situations in a DBMS using the information provided by the instrumentation facility. The simplest form of this notification mechanism (exception processing) works as follows:
A DBA defines thresholds for certain performance metrics or event counts.
The database monitor periodically checks the data provided by the instrumentation facility against the defined set of thresholds.
The database monitor notifies a DBA about any exceptional situation indicated by a threshold violation.
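The three steps above can be illustrated with a minimal sketch. It is not taken from any particular monitoring product: the metric names, the `Threshold` type, and the `check_thresholds` function are hypothetical, and the snapshot dictionary stands in for whatever the instrumentation facility actually returns. A real monitor would run the check on a timer and route the result to a notification channel.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Threshold:
    """One DBA-defined threshold (all names here are illustrative)."""
    metric: str                               # metric or event count to watch
    violates: Callable[[float, float], bool]  # True when value vs. limit is a violation
    limit: float


def check_thresholds(snapshot: Dict[str, float],
                     thresholds: List[Threshold]) -> List[Threshold]:
    """Return the thresholds violated by one snapshot of instrumentation data."""
    violated = []
    for t in thresholds:
        value = snapshot.get(t.metric)
        # Metrics missing from the snapshot are skipped rather than reported.
        if value is not None and t.violates(value, t.limit):
            violated.append(t)
    return violated


# Step 1: the DBA defines thresholds (values chosen only for illustration).
thresholds = [
    Threshold("buffer_pool_hit_ratio", operator.lt, 90.0),  # too low is bad
    Threshold("deadlocks_detected", operator.gt, 0),        # any deadlock is bad
]

# Step 2: the monitor checks a snapshot (here a hard-coded stand-in).
snapshot = {"buffer_pool_hit_ratio": 80.0, "deadlocks_detected": 0}

# Step 3: the monitor notifies the DBA about each violation.
for t in check_thresholds(snapshot, thresholds):
    print(f"ALERT: {t.metric} = {snapshot[t.metric]} violates limit {t.limit}")
```

Keeping the comparison direction inside each threshold lets the same loop handle metrics where low values are the problem (hit ratios) and metrics where high values are the problem (event counts).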
In more sophisticated monitoring environments, logical operators can be applied to threshold definitions to define complex exceptional situations. For example, an exceptional situation exists if a metric A violates threshold a and, at the same time, another metric B violates threshold b.
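One way to picture such combined definitions is as small predicates over a metrics snapshot that are composed with logical operators. The sketch below is only an illustration under assumed names: `below`, `above`, `all_of`, `any_of`, and the metric names are hypothetical and do not correspond to any specific monitoring tool.

```python
from typing import Callable, Dict

Snapshot = Dict[str, float]
Condition = Callable[[Snapshot], bool]


def below(metric: str, limit: float) -> Condition:
    """Violation when the metric is present and lower than the limit."""
    return lambda s: metric in s and s[metric] < limit


def above(metric: str, limit: float) -> Condition:
    """Violation when the metric is present and higher than the limit."""
    return lambda s: metric in s and s[metric] > limit


def all_of(*conditions: Condition) -> Condition:
    """Logical AND: every condition must be violated at the same time."""
    return lambda s: all(c(s) for c in conditions)


def any_of(*conditions: Condition) -> Condition:
    """Logical OR: at least one condition must be violated."""
    return lambda s: any(c(s) for c in conditions)


# "Metric A violates threshold a AND metric B violates threshold b."
rule = all_of(
    below("buffer_pool_hit_ratio", 90.0),
    above("avg_statement_elapsed_ms", 500.0),
)

print(rule({"buffer_pool_hit_ratio": 80.0, "avg_statement_elapsed_ms": 750.0}))  # True
print(rule({"buffer_pool_hit_ratio": 80.0, "avg_statement_elapsed_ms": 120.0}))  # False
```

Because each combinator returns another condition, rules can be nested to any depth, for example `any_of(rule, above("deadlocks_detected", 0))`.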
The idea behind exception processing is that current or future DBMS problems can be modelled as threshold violations. This mental model has an intrinsic disadvantage: the threshold definitions have to be modified (calibrated) whenever the workload processed by the DBMS changes significantly. Otherwise, the database monitor bothers the DBA with problem notifications although no exceptional situation exists at all. For example, a buffer pool hit ratio of 80% is not acceptable for OLTP-like workloads, although it may be perfectly normal for batch-style workloads. Database workload changes take place for several reasons:
OLTP-like database workload during daytime hours versus batch processing during the night shift.
OLTP-like workload during normal business days versus batch processing at the end of a month or quarter.
Ad hoc changes in the way a database is used (OLTP versus OLAP) during normal business hours.
DBAs can try to address this problem by scheduling tasks that modify the threshold definitions used by the exception processing function of a database monitor, either at certain points in time or periodically. This way, end-of-month (or end-of-quarter) processing and day/night processing can be handled to some extent. Nevertheless, this approach is error-prone because day/night workload windows can shift, and mixed workload environments (e.g. production warehouse) seem to become more and more popular, making it hard to define...