Automatic Handling of Multicollinearity Problems in the Estimation of an Econometrics Model
Publication Date: 2015-Aug-28
The IP.com Prior Art Database
Disclosed is a methodology to automatically solve the problems encountered during the estimation of a statistical model when multicollinearity is present in the data.
Price and promotion applications are used to recommend optimized prices to customers in order to maximize profits or sales, and to forecast sales based on various promotion scenarios. An econometrics model is estimated to explain how the sales volume is affected by variables such as price, promotion, seasonality, trend, and holidays. During the estimation of the econometrics model, the presence of multicollinearity in the data can cause the estimation of bad coefficients.
Multicollinearity happens when two or more predictor variables in a regression model are highly correlated. When multicollinearity happens, the model cannot distinguish between the effects of the individual variables; therefore, it cannot estimate the correct coefficients. When multicollinearity exists between two variables, it is common to see one variable ending up with a large negative coefficient and the other variable having a large positive coefficient. Because the two variables correlate and tend to happen at the same time, the large positive and negative coefficients cancel each other and the forecast looks fine during the modeling period. However, if analysts try to forecast a future week that would have only one of the two promotions turned on, then the result is a very bad forecast. When there is multicollinearity between more than two variables, often at least one of them gets an estimated coefficient with the wrong sign.
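The cancellation effect described above can be illustrated with a toy calculation. This is only a sketch: the variable names (display, feature_ad) and the coefficient values are invented for illustration and do not come from the disclosed model, and the promotion effect is assumed to be a simple sum of the coefficients of the active promotions.

```python
# Toy illustration only: names and coefficient values are hypothetical.
def promo_lift(coefs, active):
    """Sum the coefficients of the promotion variables that are switched on."""
    return sum(coefs[name] for name, on in active.items() if on)

# Two promotions that nearly always ran together in the modeling period,
# so the fit produced one large positive and one large negative coefficient.
coefs = {"display": 5.0, "feature_ad": -4.5}

both_on = promo_lift(coefs, {"display": True, "feature_ad": True})
display_only = promo_lift(coefs, {"display": True, "feature_ad": False})
ad_only = promo_lift(coefs, {"display": False, "feature_ad": True})

print(both_on)       # 0.5  -> offsetting terms give a plausible net effect
print(display_only)  # 5.0  -> implausibly large effect
print(ad_only)       # -4.5 -> effect with the wrong sign
```

The net effect is reasonable only while both promotions are active together, which is why the problem stays hidden until a forecast week uses one promotion alone.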
The current method for managing this problem is for the user to analyze the data to identify multicollinearity problems, and then take some actions, such as removing variables from the model and re-estimating this new model, to stabilize the final estimated coefficients. There is no automated way to deal with this problem.
The novel contribution is the multicollinearity algorithm, which automatically detects multicollinearity problems, individually removes some variables in a specific order, and re-estimates the coefficients of the other affected variables in the model. This algorithm is applicable to many types of price and promotion application models.
The multicollinearity algorithm only looks at multicollinearity between promotion variables. The presence of multicollinearity between other variables, such as price, trend, or seasonality, is more complicated to identify and to fix, so this solution simply focuses on promotion variables.
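The detect-and-remove idea can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed algorithm: it assumes multicollinearity is detected with pairwise Pearson correlation above a hypothetical threshold of 0.9, that the removal order is supplied by the caller as drop_order, and that re-estimation of the remaining coefficients is done separately afterwards. The function names, variable names, and data are all invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def drop_collinear(promos, drop_order, threshold=0.9):
    """Remove promotion variables one at a time, in drop_order, while each
    is still highly correlated with a variable that remains in the model.

    Assumptions (not from the source): pairwise correlation as the detection
    rule, the 0.9 threshold, and a caller-supplied removal order.
    """
    kept = dict(promos)
    dropped = []
    for name in drop_order:
        if name not in kept:
            continue
        others = [other for other in kept if other != name]
        if any(abs(pearson(kept[name], kept[o])) > threshold for o in others):
            del kept[name]
            dropped.append(name)
    return list(kept), dropped

# Hypothetical weekly promotion flags for one product (1 = promotion active).
promos = {
    "display":    [1, 1, 0, 0, 1, 0, 1, 0],
    "feature_ad": [1, 1, 0, 0, 1, 0, 1, 0],  # always runs with display
    "tpr":        [0, 1, 1, 0, 0, 1, 0, 1],
}
kept, dropped = drop_collinear(promos, ["feature_ad", "display", "tpr"])
print(kept)     # ['display', 'tpr']
print(dropped)  # ['feature_ad']
```

Because variables are removed one at a time and the check is repeated against the variables still in the model, display is retained once feature_ad is gone. The model would then be re-estimated on the kept variables only.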
Following are the details for implementing the multicollinearity algorithm in a preferred embodiment:
1. Call the multicollinearity algorithm. The process to estimate an existing econometrics model (used throughout this example) involves the creation of groups of products called demand groups. A demand group is a group of highly substitutable products. The modeling process begins by reading the point of sale (POS) data at the store-product-week level and aggregating it at the store-...