A method of basing on pre-query and multi-dimension self-adaptive parallelization of deleting from database table
Publication Date: 2014-Apr-11
The IP.com Prior Art Database
This invention provides a method of basing on pre-query and multi-dimension self-adaption of parallelly deleting rows from database tables. Dividing the whole table into a lot of chunks of rows; Dividing the chunks into groups; Get the total number of target deleted rows;On each group, initiating multiple threads working from the top to the bottom and multiple threads from the bottom to the top. With fixed frequency, we calculate the data distribution density and move threads not only among chunk groups but also within the groups, so that most of threads can concentrate in the dense area where the target rows to be deleted concentrate. Using this method, parallel deletion can learn which area contains more target rows than the others and then move threads there.
Page 01 of 15
A method of basing on pre
A method of basing on pre-
--query and multi
query and multi
query and multi-
--adaptive parallelization of deleting from database table
adaptive parallelization of deleting from database table
Information technology and business development leads to fast growth of data size and, as a result, production database servers need to maintain and archive big tables that have billions of rows. After the archivism, it's usually important to delete huge number of rows for reducing redundancy and that can take a lot of time during which tables are often locked. Thus, making the deletion faster can significantly improve the efficiency of the production database servers.
Known solution for this problem
In order to delete a large number of rows, database provides two solutions with better performance compared with the ordinary run of DELETE statement.
###111..Disable database log before executing the "DELETE" SQL statements.
$ ALTER TABLE
ACTIVATE NOT LOGGED INITIALLY $ DELETE FROM…..
###222..Use the "TRUNCATE TABLE" SQL statements.
$ TRUNCATE TABLE
###333..Using Parallel Execution Delete of other DBMS.
When executing a parallel delete operation, the parallel execution coordinator obtains parallel execution servers from the pool and assigns them to the delete operation. Issuing multiple DELETE statements with different key value ranges or rowid ranges.
Page 02 of 15
Drawbacks and new solution required
2deletes the entire data in one table and "WHERE" clause is not applicable here. However, most of time, we can't remove all data but a part of the table - a big number of either consecutive or discrete rows.
1 speeds up the process of deleting rows by disabling database log and eliminating heavy DISK I/O. This solution scarifies the availability of database server and brings huge risk. During the run of DELETE statement, if the database server encounters any exceptions like power-off or instance crash or restart, this transaction will rollback and the whole table will turn into the status of "Drop Pending" and can't be used any more.
33 DELETE operations can only be parallelized on partitioned tables. DELETE parallelism are not possible within a partition, nor on a nonpartitioned table. It only can blance the workload among parallel execution servers, using thelimit parallel execution servers to run DELETE operations, don't have the self-adaption mechanism according to the distribution of target deleted rows.
We need a deletion solution that keeps database log, works on any types of tables, supports filtering and performs significantly faster.
The essential idea of this invention
The essential idea of this invention
This invention provides a method of multi-dimension self-adaption of para...