A Mechanism for Parallelizing Database Interactions for Improved Performance of Applications Persisting or Reading Data from a Storage System.
Original Publication Date: 2005-Nov-09
Included in the Prior Art Database: 2005-Nov-09
A mechanism for applying parallel database access on persistent objects for better performance.
The Enterprise JavaBeans (EJB) specification defines the bean lifecycle methods ejbCreate, ejbStore and ejbRemove. They control the creation, update and removal of an EJB entity bean, which has a physical data representation in a datastore (e.g. a row in a relational database table). Some lifecycle methods are driven explicitly by application code, such as ejbCreate and ejbRemove; others are driven by the container when needed, e.g. ejbStore is called at the end of a transaction to make the changed data available for later access. Inside WebSphere, EJB CMP has a feature, called "deferred update", that defers all database calls to insert, update or delete a row until they are needed. The Persistence Manager holds the object change history until a flush is needed, either at the end of the transaction or before a finder method runs. The object-level change history is then pushed to the database.
Similarly, in the case of SDOs, the SDO data graph holds the object change history until applyChange is called. The database operations identified through that change history are then executed serially, not in parallel on multiple threads.
In both the EJB and SDO cases, when the flush or apply-change step happens, the database calls to insert, update or delete a row are serialized: within a transaction, we wait for one database operation to finish before starting the next, even though the operations may be logically unrelated. For example, EJBs without a CMR definition are not related to each other within an EJB module, and EJBs that belong to different EJB modules are not related, so insert, update and delete operations on unrelated EJB types do not need to follow any order. In the SDO case, if the data graph contains unrelated subgraphs, the persistence of the individual subgraphs does not need to follow any order.
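The observation in the paragraph above, that operations on unrelated types need no mutual ordering, can be illustrated with a minimal sketch. This is not the disclosed mechanism: the `Change` class, the `groupByType` and `flush` methods, and the use of the bean type name as the grouping key are all assumptions made for illustration, and the database call is replaced by a placeholder that only records what would be executed. The sketch ignores transaction and connection handling, and a real container would use its own managed threads rather than an application-created pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFlushSketch {

    /** Hypothetical record of one deferred change. */
    static final class Change {
        final String beanType;   // e.g. "Customer"; assumed unrelated to other types
        final String operation;  // e.g. "INSERT", "UPDATE", "DELETE"

        Change(String beanType, String operation) {
            this.beanType = beanType;
            this.operation = operation;
        }
    }

    /** Groups changes by bean type, keeping the original order inside each group. */
    static Map<String, List<Change>> groupByType(List<Change> history) {
        Map<String, List<Change>> groups = new TreeMap<>();
        for (Change c : history) {
            groups.computeIfAbsent(c.beanType, k -> new ArrayList<>()).add(c);
        }
        return groups;
    }

    /** Runs each group as its own task and returns a per-group log of operations. */
    static Map<String, List<String>> flush(List<Change> history) {
        Map<String, List<Change>> groups = groupByType(history);
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, groups.size()));
        try {
            Map<String, Future<List<String>>> pending = new TreeMap<>();
            for (Map.Entry<String, List<Change>> e : groups.entrySet()) {
                Callable<List<String>> task = () -> {
                    List<String> log = new ArrayList<>();
                    for (Change c : e.getValue()) {
                        // Placeholder for the real database call (assumption).
                        log.add(c.operation + " " + c.beanType);
                    }
                    return log;
                };
                pending.put(e.getKey(), pool.submit(task));
            }
            Map<String, List<String>> result = new TreeMap<>();
            for (Map.Entry<String, Future<List<String>>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get()); // wait for every group
            }
            return result;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("flush failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Change> history = new ArrayList<>();
        history.add(new Change("Customer", "INSERT"));
        history.add(new Change("Order", "DELETE"));
        history.add(new Change("Customer", "UPDATE"));
        System.out.println(flush(history));
    }
}
```

Operations within one group still run in their original order, so only the ordering between unrelated groups is relaxed.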
Similarly, on the read path, a user may trigger two finder methods without iterating over the results, which can happen both with customer-defined EJB finders and with SDO data graph loading. Today the two finders run sequentially: the second is started only after the result of the first comes back. This serialized approach does not make the best use of CPU time on a multiprocessor system, or on systems where the database is remote.
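As a rough illustration of the read path described above, the following sketch starts every finder before waiting on any result. The `runFinders` method and the modelling of a finder as a `Supplier` are assumptions made for this example, not part of EJB or SDO. `CompletableFuture.supplyAsync` is used here only for brevity; it runs on the common fork-join pool, whereas blocking database calls would more plausibly run on a dedicated or container-managed executor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ParallelFinderSketch {

    /**
     * Starts all finders first, then collects their results in the order
     * the finders were given. Each finder is a hypothetical stand-in for a
     * query whose result the caller has not yet iterated.
     */
    static List<List<String>> runFinders(List<Supplier<List<String>>> finders) {
        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
        for (Supplier<List<String>> finder : finders) {
            futures.add(CompletableFuture.supplyAsync(finder)); // issue without waiting
        }
        List<List<String>> results = new ArrayList<>();
        for (CompletableFuture<List<String>> future : futures) {
            results.add(future.join()); // block only when the result is needed
        }
        return results;
    }

    public static void main(String[] args) {
        List<Supplier<List<String>>> finders = new ArrayList<>();
        finders.add(() -> Arrays.asList("customer-1", "customer-2")); // simulated finder
        finders.add(() -> Arrays.asList("order-7"));                  // simulated finder
        System.out.println(runFinders(finders));
    }
}
```

With two finders that each wait on a remote database, the total wait in this arrangement approaches that of the slower finder instead of the sum of both.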
This invention introduces a mechanism for issuing database operations in parallel for EJB CMP (Container Managed Persistence) beans or SDOs. It achieves better performance by making better use of CPU time and, especially in the multi-CPU case, allows a higher degree of parallelism in the runtime.
The same scheme can be used by other object persistence systems.
To achieve parallel database access in a database-agnostic manner, the following is needed:
First, different progra...