Bias Reduction for the Random Forest Learning Machines

IP.com Disclosure Number: IPCOM000021712D
Publication Date: 2004-Feb-04
Document File: 3 page(s) / 23K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for learning random forest (RF) models with increased predictive accuracy. The disclosed method modifies the split construction algorithm, favoring variables with better power to predict the response. Substantial improvement in predictive accuracy is obtained when many irrelevant inputs are present, in both classification and regression.

Bias Reduction for the Random Forest Learning Machines

Background

Supervised statistical learning finds a dependence of a response variable on a set of predictor variables and then uses it to predict the response. Prediction of a categorical response is called a classification problem, and prediction of a numeric response is called a regression problem. A decision tree (DT) is a set of decision rules that predict the value of the response variable based on the values of the predictor variables. The most important notion in a decision tree is the so-called "split", an atomic rule defined on one predictor variable. An example of a split would be x > 10 for a numeric variable, and x ∈ {"red", "orange", "magenta"} for a categorical variable. A split has a measure of predictive power called a split weight. The quality of the model shows how well it can predict the response on new data. The algorithm for learning an optimal decision tree is intractable. The well-known approximation by Breiman et al., called a Classification and Regression Tree (CART), is a simple iterative search. A single iteration step involves constructing the split with maximum weight on each of the predictor variables, comparing their weights, and choosing the one with the highest value. Being the simplest version of decision tree learning, this algorithm is known for constructing DTs with poor predictive power.
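
For concreteness, the split search just described can be sketched in a few lines of Python. This is only an illustrative sketch: variance reduction is used as the split weight for a numeric predictor (a common choice for regression; the disclosure does not fix a particular weight function), and the function names are assumptions.

    import numpy as np

    def best_split_numeric(x, y):
        # Find the split "x > t" with maximum weight on one numeric predictor.
        # Variance reduction serves as the split weight here; other weight
        # functions can be substituted.
        order = np.argsort(x)
        x_sorted, y_sorted = x[order], y[order]
        total_sse = np.var(y) * len(y)  # total sum of squared deviations
        best_weight, best_threshold = -np.inf, None
        for i in range(1, len(y)):
            if x_sorted[i] == x_sorted[i - 1]:
                continue  # cannot place a threshold between equal values
            left, right = y_sorted[:i], y_sorted[i:]
            weight = total_sse - (np.var(left) * len(left) + np.var(right) * len(right))
            if weight > best_weight:
                best_weight = weight
                best_threshold = (x_sorted[i - 1] + x_sorted[i]) / 2
        return best_threshold, best_weight

    def cart_step(X, y):
        # One CART iteration: construct the maximum-weight split on each
        # predictor, compare the weights, and choose the highest.
        candidates = [best_split_numeric(X[:, j], y) for j in range(X.shape[1])]
        best = max(range(X.shape[1]), key=lambda j: candidates[j][1])
        return best, candidates[best][0]  # chosen variable index and threshold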

A substantial improvement on CART for classification problems is an algorithm called Random Forest (RF). Unlike other state-of-the-art learning machines, it has unique features that make it invaluable when interpretability and informative data exploration are the primary goals.

Instead of building one large DT, RF builds a set of DTs that may be thought of as a collection of tree predictors. Each DT is built independently, and the response is calculated as a function of the responses from each tree (e.g., if the response is categorical, the forest predicts by majority vote over the trees; if it is numeric, the tree predictions are averaged).
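
The aggregation step can be illustrated with the following sketch, which combines per-tree predictions under the standard conventions of majority vote for classification and averaging for regression. The forest_predict name and the assumption that each tree exposes a predict(X) method are illustrative, not part of the disclosure.

    import numpy as np
    from collections import Counter

    def forest_predict(trees, X, task="classification"):
        # Collect the independent response of each tree: shape (n_trees, n_samples).
        per_tree = np.array([tree.predict(X) for tree in trees])
        if task == "classification":
            # Categorical response: majority vote across trees for each sample.
            return np.array([Counter(col).most_common(1)[0][0] for col in per_tree.T])
        # Numeric response: average the tree predictions.
        return per_tree.mean(axis=0)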