Parallel alternative database queries on multi-node big data stores - first to complete wins
Publication Date: 2018-Jul-11
The IP.com Prior Art Database
Disclosed is a system for optimizing query performance in clustered NoSQL
databases by introducing design-level query parallelism. It can be applied to
any distributed NoSQL database (e.g. Apache Cassandra) with a replication factor greater
than 1 and more than one node.
A typical use case is as follows. Query A performs acceptably for some data
(that is, when executed on some database content structures) but unacceptably
for other data. At the same time there exists an alternative query B that is
equivalent to query A in the sense that both queries return the same results
for every possible database content structure. Because query B is designed
differently, it also behaves differently: it performs well for all or some of
the data for which query A performs poorly, and poorly for all or some of the
data for which query A performs well. Every time the query is executed, the goal
is the best possible performance, so ideally the application would run the
version of the query (A or B) that performs best for the current database
content. However, the application executes the query on a database whose content
structure varies, so it cannot easily be predicted in advance which version will
perform best at a given time.
The solution is to always execute both versions of the query, A and B, in parallel.
The result is taken from whichever query completes first, and the other query is then
canceled (see figure).
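The first-to-complete pattern described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: `first_to_complete`, `query_a` and `query_b` are hypothetical names, and the two queries are simulated with timed sleeps rather than real database calls. Whether a losing query can actually be canceled on the database side depends on the driver and the database, which this sketch assumes away.

```python
import asyncio


async def first_to_complete(*variants):
    """Start every equivalent query variant at once and return the first result.

    Each variant is a zero-argument coroutine function (hypothetical
    stand-in for an asynchronous database call).
    """
    tasks = [asyncio.ensure_future(variant()) for variant in variants]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the losing queries are no longer needed
    if pending:
        # Let the cancellations be processed before returning.
        await asyncio.gather(*pending, return_exceptions=True)
    # Re-raises if the winning variant failed; a fuller version
    # might instead wait for one of the remaining variants.
    return done.pop().result()


async def query_a():
    # Assumption: simulates query A being slow for the current data.
    await asyncio.sleep(0.5)
    return ("A", [1, 2, 3])


async def query_b():
    # Assumption: simulates query B being fast for the current data.
    await asyncio.sleep(0.01)
    return ("B", [1, 2, 3])


if __name__ == "__main__":
    winner, rows = asyncio.run(first_to_complete(query_a, query_b))
    print(winner, rows)
```

Both stand-in queries return the same rows, mirroring the requirement that A and B be equivalent; only the label of the winning version differs from run to run as the data changes.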
The described solution has the following benefits. The query time is always that
of the fastest available version for the current data. Thanks to data replication
and clustering, both queries can be executed on different data replicas and
different cluster nodes, so they do not hamper each other's
performance. More generally, there can be more than two versions of the query, e.g. A1, A2,
..., An. Every version of the query must perform well for some data for which none of the
other versions does. Replication factor of the...