Browse Prior Art Database

Deriving a Heuristic Function of an A* Search for Optimal Reducers in Query Processing

IP.com Disclosure Number: IPCOM000102709D
Original Publication Date: 1990-Dec-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 4 page(s) / 137K

Publishing Venue

IBM

Related People

Chen, MS: AUTHOR [+2]

Abstract

A heuristic function of an A* search for an optimal sequence of reducers to process distributed tree queries is derived in this disclosure. Based on the minimal cardinality and the minimal width of each relation, which are respectively determined by two algorithms proposed, the heuristic function is derived and shown to be able to lead to an optimal sequence of reducers.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 45% of the total text.

Deriving a Heuristic Function of an A* Search for Optimal Reducers in Query Processing

       A heuristic function of an A* search for an optimal
sequence of reducers to process distributed tree queries is derived
in this disclosure.  Based on the minimal cardinality and the minimal
width of each relation, which are respectively determined by two
algorithms proposed, the heuristic function is derived and shown to
be able to lead to an optimal sequence of reducers.

      The problem of determining an optimal sequence of reducers for
distributed query processing has been proved to be NP-hard, and it is
therefore necessary to resort to a heuristic search method to deal
with the problem.  Among various search methods, A* is a well-known
heuristic search method based on state transition, and has been
applied to the problems in artificial intelligence as well as the one
in finding a sequence of semijoins (*).  Note that an optimal
sequence of reducers for distributed query processing consists of
joins and semijoins.  After a set of reducers are performed, the
resulting query graph and the corresponding profile can be obtained
from the original ones according to the operations applied.  In view
of this, we denote a query graph with its profile as a state of the
query.  The original query graph with its profile is called the
initial state and a state with the final answer for the query in its
corresponding profile is called a goal state. In the A* search
algorithm, each state of a query graph is represented by a node in
the search graph and the search is guided by the value of an
evaluation function f on each node.  The function f(p) is formulated
as f(p) = g(p) + h(p), where g(p) is the cost required from the
initial state to state p and h(p) is a lower bound of the cost
required from state p to a goal state.  Note that the condition for
h(p) to be a lower bound of the actual cost required is a necessary
and sufficient condition to lead to an optimal solution.  The
objective of this disclosure is to develop a heuristic function for
an A* search to determine an optimal sequence of join and semijoin
reducers for tree queries.

      Let p be a state in the search graph and S(p) be the set
of reducers, joins and semijoins, applied to the initial state to
reach state p.  Thus, g(p) is the cost required for the operations in
S(p), i.e., the cost to reach state p from the initial state.  Note
that due to the execution of joins and semijoins, the cardinality and
the width of a relation may change.  To determine a heuristic
function h(p) which can serve as a lower bound of the cost required
to finish the query from state p, we determine, respectively, the
minimal cardinality and the minimal width achievable by each relation
resulting from all possible reducers after a given state p.  By
multiplying the minimal cardinality and the minimal width of a
relation, we obtain a lower bound for the size of the relation after
state p, i.e....