
dealing with multicollinearity

2 messages · Manuel Gutierrez, ronggui

#
I have a linear model y ~ x1 + x2 fitted to some data, where the
coefficient for x1 is higher than I would have expected from theory
(0.88 estimated vs. 0.7 expected).
I wondered whether this could be an artifact of x1 and x2 being
correlated, even though the variance inflation factor is not high
(1.065).
I used perturbation analysis to evaluate collinearity:
library(perturb)
P<-perturb(A,pvars=c("x1","x2"),prange=c(1,1))
Perturb variables:
x1 		 normal(0,1) 
x2 		 normal(0,1) 

Impact of perturbations on coefficients:
            mean     s.d.     min      max     
(Intercept)  -26.067    0.270  -27.235  -25.481
x1             0.726    0.025    0.672    0.882
x2             0.060    0.011    0.037    0.082

I get a mean for x1 of 0.726, which is closer to what is expected.
I am not a statistical expert, so I'd like to know whether my
evaluation of the effects of collinearity is correct and, if so,
what solutions there are for obtaining a reliable linear model.
Thanks,
Manuel

Some more detailed information:
Call:
lm(formula = y ~ x1 + x2)

Residuals:
      Min        1Q    Median        3Q       Max 
-4.221946 -0.484055 -0.004762  0.397508  2.542769 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -27.23472    0.27996 -97.282  < 2e-16 ***
x1            0.88202    0.02475  35.639  < 2e-16 ***
x2            0.08180    0.01239   6.604 2.53e-10 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

Residual standard error: 0.823 on 241 degrees of freedom
Multiple R-Squared: 0.8411,	Adjusted R-squared: 0.8398
F-statistic: 637.8 on 2 and 241 DF,  p-value: < 2.2e-16

Pearson's product-moment correlation

data:  x1 and x2 
t = -3.9924, df = 242, p-value = 8.678e-05
alternative hypothesis: true correlation is not equal to 0 
95 percent confidence interval:
 -0.3628424 -0.1269618 
sample estimates:
      cor 
-0.248584
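
As a cross-check on the figures above, and assuming the reported VIF comes from the standard formula, the VIF in a model with exactly two predictors depends only on their pairwise correlation r. Plugging in the sample correlation printed above gives a value consistent with the 1.065 quoted earlier:

```latex
\mathrm{VIF}_{x_1} = \mathrm{VIF}_{x_2}
  = \frac{1}{1 - r^{2}}
  = \frac{1}{1 - (-0.248584)^{2}}
  \approx \frac{1}{0.9382}
  \approx 1.066,
\qquad
\sqrt{\mathrm{VIF}} \approx 1.032 .
```

Read this way, the correlation between x1 and x2 would inflate the standard errors of the coefficients by only about 3% relative to uncorrelated predictors.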
#
Why not use the vif() function (from the car package) to calculate the VIF and help you assess whether collinearity is influential?
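
A minimal sketch of that check, using simulated stand-in data (the variable values, seed, and coefficients below are assumptions for illustration, not the poster's data). It computes the VIF by hand in base R and, only if the car package happens to be installed, prints car::vif() for comparison:

```r
# Simulated stand-in data; NOT the original data set.
set.seed(1)
n  <- 244
x2 <- rnorm(n)
x1 <- -0.25 * x2 + rnorm(n)            # mildly (negatively) correlated with x2
y  <- -27 + 0.9 * x1 + 0.08 * x2 + rnorm(n, sd = 0.8)

fit <- lm(y ~ x1 + x2)

# VIF for x1: regress x1 on the other predictor(s) and use 1 / (1 - R^2).
r2_x1  <- summary(lm(x1 ~ x2))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

# With only two predictors, this equals 1 / (1 - cor(x1, x2)^2).
vif_from_cor <- 1 / (1 - cor(x1, x2)^2)

print(c(by_regression = vif_x1, from_correlation = vif_from_cor))

# Optional comparison, only if car is available on this machine.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(fit))
}
```

A value close to 1 suggests the predictors share little variance; common rules of thumb only start to flag collinearity at much larger values.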

I have never seen any book that deals with this topic using perturbation analysis.

VIF, tolerance, and principal component analysis are the usual tools for dealing with collinearity. You can find the details in John Fox's book.

Generally, calculating the correlation directly is not essential.

One more thing: if the purpose of your model is prediction rather than interpretation, collinearity does not matter much.


On Mon, 11 Apr 2005 12:22:55 +0200 (CEST)
Manuel Gutierrez <manuel_gutierrez_lopez at yahoo.es> wrote: