
# Extracting Rules From Backpropagation Units

IP.com Disclosure Number: IPCOM000120131D
Original Publication Date: 1991-Mar-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 7 page(s) / 209K

IBM

## Related People

Henckel, JD: AUTHOR [+2]

## Abstract

Described is a method for converting the function performed by a backpropagation unit into a set of if-then rules. This method is not limited to backpropagation but can be used to extract knowledge from any trained feed-forward neural network. The set of rules is exactly the essential prime implicants of the function computed by the unit.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 34% of the total text.

Other keywords:  connectionist, expert system, knowledge
acquisition, lattice search pruning, learning, rule generation,
explanation.

Fig. 1 shows an example of a BP unit with four input variables.
For our purposes, we will assume the input variables are binary, 0 or 1.
The weight wi associated with each input variable, and the threshold
t, are real numbers.

The unit's operation is to first find its net input

    net = w1*i1 + w2*i2 + ... + wn*in + t

(the threshold t enters the net additively, consistent with the
worked example below) and then to compute its output

    out = 1 / (1 + e^(-net))

This formula is called the activation function.  It is a
squashing function, so the output is a real value in the range (0, 1).
An output above 0.9 is generally interpreted as a 1 or TRUE, and an
output below 0.1 as a 0 or FALSE.  Other output values are interpreted
as ambiguous.
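As a concrete illustration, the unit's computation can be sketched in
Python (a minimal sketch, assuming the threshold t is added to the
weighted sum, which matches the worked example W = (5, 4, -3, -2),
t = 2 given later in the text):

```python
import math

def unit_output(weights, t, inputs):
    """Compute a BP unit's output: the sigmoid of its net input.

    Assumes the threshold t enters the net as an additive bias,
    consistent with the worked example in the text.
    """
    net = sum(w * i for w, i in zip(weights, inputs)) + t
    return 1.0 / (1.0 + math.exp(-net))

def interpret(out):
    """Map the real-valued output to high, low, or ambiguous."""
    if out > 0.9:
        return "high"       # interpreted as 1 / TRUE
    if out < 0.1:
        return "low"        # interpreted as 0 / FALSE
    return "ambiguous"

# The example unit from the text: W = (5, 4, -3, -2), t = 2.
W, t = (5, 4, -3, -2), 2
print(interpret(unit_output(W, t, (1, 1, 0, 0))))  # net = 11 -> high
print(interpret(unit_output(W, t, (0, 0, 1, 1))))  # net = -3 -> low
```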

Our rule extraction algorithm efficiently converts the function
performed by a BP unit to a minimal set of production rules.  A
"minimal" set means that the total size of the left-hand sides of the
productions is as small as possible.

Before describing the algorithm, we will make some
observations.
1.  The function performed by a BP unit depends entirely on its
threshold and weights.  Therefore, the threshold and weights are all
the information needed to extract rules.
2.  The inputs whose associated weights have the greatest absolute
value have the greatest effect on the output of the unit.
3.  The activation function is monotonic; it tends to 1 for large
positive net and to 0 for large negative net.

Each input variable has a corresponding weight.  When BP is in
learning mode, these weights and the threshold are adjusted.
Afterwards,  the weights and threshold are not changed.  Inputs of 0
have no effect on the net.  Inputs of 1 with negative corresponding
weights decrease the net and those with positive corresponding
weights increase the net.
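These observations give tight bounds on the net when only some inputs
are fixed, which is the basis for deciding whether a candidate rule
can ever be violated. A hypothetical helper sketching the bound
computation (again assuming t is an additive bias, as in the worked
example in the text):

```python
def net_bounds(weights, t, fixed):
    """Min and max possible net given a partial input assignment.

    `fixed` maps 0-based input indices to 0 or 1; unfixed inputs may
    take either value.  An input of 0 contributes nothing; an input
    of 1 contributes its weight.  So an unfixed negative weight
    lowers the minimum and an unfixed positive weight raises the
    maximum.  The threshold t is treated as an additive bias.
    """
    lo = hi = t
    for j, w in enumerate(weights):
        if j in fixed:
            lo += w * fixed[j]
            hi += w * fixed[j]
        else:
            lo += min(0, w)
            hi += max(0, w)
    return lo, hi

# With W = (5, 4, -3, -2), t = 2, fixing i1 = 1 and i4 = 0:
print(net_bounds((5, 4, -3, -2), 2, {0: 1, 3: 0}))  # (4, 11)
```

The minimum of 4 here matches the text's observation that, with i1=1
and i4=0, the net is at least 4.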

The production rules to be generated will take two forms:
if <expression1> then output is high
if <expression2> then output is low

The <expression> is a conjunction of statements about the input
variables.  (Allowing disjunction would not make the expressions more
powerful.)

In Fig. 1, suppose W = (5, 4, -3, -2) and t=2.  Conveniently,
these weights are in sorted order by absolute value.  If they were
not, we would use a table sort to index them in sorted order.  If
i1=1 and i4=0 then the net would be at least 4, so the rule
(1)  if i1=1 and i4=0 then output is high
is valid.  Notice that
(2)  if...