Browse Prior Art Database

Density Estimation by Stochastic Complexity

IP.com Disclosure Number: IPCOM000102268D
Original Publication Date: 1990-Nov-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 2 page(s) / 55K

Publishing Venue

IBM

Related People

Rissanen, J: AUTHOR

Abstract

An algorithm for estimating a probability density function of the histogram type is disclosed. Unlike the histogram estimators in the prior art, which have equal-width bins, the new estimator has variable-width bins, which are optimized by a code-length criterion.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Density Estimation by Stochastic Complexity

       An algorithm for estimating a probability density
function of the histogram type is disclosed.  Unlike the histogram
estimators in the prior art, which have equal-width bins, the new
estimator has variable-width bins, which are optimized by a
code-length criterion.

      Different density estimates can be compared by the code length
with which the observed data, together with the estimator itself, can
be encoded.  For histogram estimators an idealized version of the
code length for the data, called the stochastic complexity, can be
calculated by a formula in which the bin boundaries appear as
parameters.

                            (Image Omitted)

in terms of the multinomial
                     n            n!
                (         ) =  ________
                 n1,...,nm       fini!
and the binomial
                  n + m-1      (n + m-1!
                (         ) =  _________
                     n          n!(m-1)!
coefficients.  Here, ni denotes the number of observations that fall
in the ith bin, a = 0,a1,ldots,am-1,am the bin boundaries with am =
R.  In order to reduce the code length needed to encode the
boundaries, two parametrically expressed constraints are imposed on
them.  The first is the precision in which the boundaries are
written, and the second is the minimum bin width.  Specifically, ai =
ki,d and the minimum bin width is kd....