# Statistical Training of Artificial Neural Network Functions

IP.com Disclosure Number: IPCOM000100282D
Original Publication Date: 1990-Mar-01
Included in the Prior Art Database: 2005-Mar-15
Document File: 2 page(s) / 62K

IBM

## Related People

Bakis, R: AUTHOR [+1]

## Abstract

A training algorithm is described for use with artificial neural network functions (multilayer perceptrons). The algorithm is inherently statistical, in contrast to existing algorithms (e.g., minimum mean square back propagation), which are essentially deterministic.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

A training algorithm is described for use with artificial
neural network functions (multilayer perceptrons).  The algorithm is
inherently statistical, in contrast to existing algorithms (e.g.,
minimum mean square back propagation), which are essentially
deterministic.

Let $y = f_r(x)$ denote a feed-forward neural network function
whose argument (input) is a vector $x = (x_1,\ldots,x_n)$ and whose output
is a vector $y = (y_1,\ldots,y_m)$.  Given training vector pairs
$\{x(t), y(t);\ t = 1,\ldots,T\}$, the state-of-the-art training method is
back propagation, wherein the trainable parameters $r = (r_1,\ldots,r_k)$
are obtained by minimizing some error metric similar to

$$\sum_{t=1}^{T} \sum_{i=1}^{m} \bigl( y_i(t) - (f_r(x(t)))_i \bigr)^2$$

This objective function fails to utilize possible stochastic
descriptions of the data excepting the very special case where the
output vector is assumed to have independent Gaussian components with
a common variance; such assumptions are rarely valid and are
completely inappropriate for, e.g., networks with binary outputs.
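
The sum-of-squares objective above, and the Gaussian special case it corresponds to, can be sketched numerically. This is a minimal illustration, not the disclosure's method: the one-hidden-layer `tanh` network, the parameter layout `(W1, b1, W2, b2)`, the random data, and all function names are assumptions made for the example. It checks that the negative log-likelihood under independent Gaussian outputs with a common variance equals the sum of squared errors scaled by 1/(2 sigma^2), plus a constant that does not depend on r.

```python
import numpy as np

def f_r(x, r):
    """Hypothetical one-hidden-layer network; r = (W1, b1, W2, b2)."""
    W1, b1, W2, b2 = r
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2

def sum_squared_error(r, X, Y):
    """Sum over t and i of (y_i(t) - (f_r(x(t)))_i)^2."""
    residual = Y - f_r(X, r)
    return float(np.sum(residual ** 2))

def gaussian_neg_log_likelihood(r, X, Y, sigma):
    """-log p(Y | X) for independent Gaussian outputs with common variance."""
    residual = Y - f_r(X, r)
    log_pdf = -0.5 * np.log(2.0 * np.pi * sigma ** 2) - residual ** 2 / (2.0 * sigma ** 2)
    return float(-np.sum(log_pdf))

# Illustrative sizes and random data (assumptions, not from the source).
rng = np.random.default_rng(0)
n, n_hidden, m, T = 3, 4, 2, 5
r = (rng.normal(size=(n, n_hidden)), np.zeros(n_hidden),
     rng.normal(size=(n_hidden, m)), np.zeros(m))
X = rng.normal(size=(T, n))
Y = rng.normal(size=(T, m))

sigma = 0.7
sse = sum_squared_error(r, X, Y)
nll = gaussian_neg_log_likelihood(r, X, Y, sigma)

# The two objectives differ only by a positive scale and an additive constant,
# so they share the same minimizing parameters r.
constant = 0.5 * T * m * np.log(2.0 * np.pi * sigma ** 2)
print(np.isclose(nll, sse / (2.0 * sigma ** 2) + constant))
```

Because the scale and the constant are independent of r, minimizing either quantity gives the same parameters, which is why least squares implicitly assumes the Gaussian, common-variance model.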

STEP 1.  Describe an input-output pair as a jointly distributed
pair of random vectors (X,Y) with density (or probability) function
$p_r(x,y)$.  The parameters r of this density are precisely the neural
network parameters that we must learn.

STEP 2.  Regard the output of the neural net function $f_r$ as the
expected value, i.e., the average, of Y for a fixed input x; that is, identify

$$f_r(x) = E[Y \mid X = x] = \frac{\int y \, p_r(x,y) \, dy}{\int p_r(x,y) \, dy}$$

where the integrations...
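
The identification in STEP 2, a conditional mean written as a ratio of two integrals over y, can be checked numerically on a density whose conditional mean is known in closed form. The sketch below is only an illustration under stated assumptions: the joint density is taken to be a standard bivariate Gaussian with correlation `rho` (not a density from the disclosure), the integrals are approximated by a Riemann sum on a fixed grid, and the function names are hypothetical. For this density the exact conditional mean is `rho * x`.

```python
import numpy as np

def joint_density(x, y, rho):
    """Assumed example p(x, y): standard bivariate Gaussian with correlation rho."""
    norm = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho ** 2))
    quad = (x ** 2 - 2.0 * rho * x * y + y ** 2) / (1.0 - rho ** 2)
    return norm * np.exp(-0.5 * quad)

def conditional_mean(x, rho, y_min=-10.0, y_max=10.0, num=20001):
    """Approximate E[Y | X = x] as (integral of y p dy) / (integral of p dy)."""
    y = np.linspace(y_min, y_max, num)
    p = joint_density(x, y, rho)
    dy = y[1] - y[0]
    numerator = np.sum(y * p) * dy    # Riemann sum for the integral of y p(x, y) dy
    denominator = np.sum(p) * dy      # Riemann sum for the integral of p(x, y) dy
    return numerator / denominator

rho = 0.6
x0 = 1.5
estimate = conditional_mean(x0, rho)
exact = rho * x0  # closed-form conditional mean for this assumed density
print(abs(estimate - exact) < 1e-6)
```

The grid spacing cancels in the ratio, so only the relative weights of the density along y matter; the denominator is the marginal density of X at x.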