Hausman test in R

6 messages · Bert Gunter, Joshua Wiley, John C Frain +1 more

#
Hi there,

I am really new to statistics in R and statistics itself as well.
My situation: I ran a lot of OLS regressions with different independent
variables (using the lm() function).
After having done that, I suspect there is endogeneity due to omitted
variables (or perhaps for other reasons).
And here comes the Hausman test. I know this test is used to identify
endogeneity.
But what I am not sure about is: "Can I use the Hausman test in a simple OLS
regression or is it only possible in a 2SLS regression model?" "And if it is
possible to use it, how can I do it?"

Info about the data:

data = lots of data :)

x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
y1 <- data$y1

reg1 <- summary(lm(y1 ~ x1 + x2 + x3 + x4))

Thanks in advance for any support!



--
View this message in context: http://r.789695.n4.nabble.com/Hausman-test-in-R-tp4647716.html
Sent from the R help mailing list archive at Nabble.com.
#
1. These are primarily statistics issues, not R issues. You should
post on a statistical help list like stats.stackexchange.com, not
here.

2. However, given your acknowledged statistical ignorance, you may be
asking for trouble. I suggest you seek help from a local statistical
expert to get you started. Then, depending on your statistical
background, you may understand enough to drive safely on your own.

Also try at the R command prompt:

install.packages("fortunes")
library(fortunes)
fortune("brain surgery")

Cheers,
 Bert
On Sun, Oct 28, 2012 at 1:33 PM, fxen3k <f.sehardt at gmail.com> wrote:

  
    
#
Given my "acknowledged statistical ignorance", I tried to find a *solution*
in this forum...
And this is not primarily a statistical issue, it is an issue about the
Hausman test in the R environment. 

I cannot imagine that no one in this forum has ever done a Hausman test on OLS
regressions.
I looked through the systemfit package documentation and found only this
example, which refers to 2SLS and 3SLS regressions:

library( systemfit )
data( "Kmenta" )
eqDemand <- consump ~ price + income
eqSupply <- consump ~ price + farmPrice + trend
inst <- ~ income + farmPrice + trend
system <- list( demand = eqDemand, supply = eqSupply )
## perform the estimations
fit2sls <- systemfit( system, "2SLS", inst = inst, data = Kmenta )
fit3sls <- systemfit( system, "3SLS", inst = inst, data = Kmenta )
## perform the Hausman test
h <- hausman.systemfit( fit2sls, fit3sls )
print( h )




#
On 29 October 2012 16:56, fxen3k <f.sehardt at gmail.com> wrote:
snip

If we are talking about the same test, a Hausman test cannot be
applied to OLS regressions alone.  As you have already been told, you must
have two estimates of the same set of coefficients to do a Hausman
test.

Suppose that you compute both OLS and IV estimates of a particular
regression; you will get two estimates of the coefficients in the
model. If the disturbances are not correlated with the explanatory
variables (no endogeneity), the two sets of coefficients will be
similar.  If there is endogeneity, the coefficients will be different.
The Hausman test is a test of the null that the coefficients are not
different.   If the null is not rejected, you will probably accept the OLS
regression. If the null is rejected, you may consider the IV estimate.
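
A minimal sketch of this OLS-versus-IV comparison, using base R only and
simulated data. The data-generating process, the variable names (z, u, x, y)
and the choice to test only the coefficient on the single endogenous regressor
(a chi-squared statistic with 1 degree of freedom) are assumptions made for
illustration; this is not the original poster's data or a definitive
implementation.

```r
set.seed(1)
n <- 500
z <- rnorm(n)                       # instrument (assumed valid)
u <- rnorm(n)                       # structural error
x <- 0.8 * z + 0.5 * u + rnorm(n)   # x depends on u, so x is endogenous
y <- 1 + 2 * x + u

# Estimate 1: OLS (efficient under the null, inconsistent under endogeneity)
ols   <- lm(y ~ x)
b_ols <- coef(ols)
V_ols <- vcov(ols)

# Estimate 2: 2SLS computed by hand (consistent in both cases)
X     <- cbind(1, x)
Z     <- cbind(1, z)
Xhat  <- Z %*% solve(crossprod(Z), crossprod(Z, X))  # first-stage fitted values
b_iv  <- solve(crossprod(Xhat, X), crossprod(Xhat, y))
s2_iv <- sum((y - X %*% b_iv)^2) / (n - ncol(X))     # uses X, not Xhat
V_iv  <- s2_iv * solve(crossprod(Xhat))

# Hausman statistic for the coefficient on x (position 2 in both estimates)
d <- as.numeric(b_iv[2]) - unname(b_ols["x"])
H <- d^2 / (V_iv[2, 2] - V_ols[2, 2])
p <- pchisq(H, df = 1, lower.tail = FALSE)

cat("Hausman statistic:", H, " p-value:", p, "\n")
```

A small p-value suggests that the OLS and IV estimates differ by more than
sampling error would explain, which is evidence against exogeneity of x.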

A Hausman test is applicable in many other situations (fixed vs. random
effects, etc.).  You may have problems with the estimate of the
covariance matrix used in the test because, due to numerical
problems, the estimate of that matrix is not always positive
definite.

Most intermediate-level econometrics textbooks will have a good
account of the Hausman test. Greene (2012), Econometric Analysis, 7th
edition, Prentice Hall, contains a comprehensive discussion of these
matters which you might read.  It is not easy, but if you master the
basic concepts there, your questions about their implementation in R
are likely to be answered on this forum.

Best Regards

John

  
    
#
Thanks for your answer, John!

Having read in Wooldridge, Verbeek and Hausman himself, I tried to figure
out how this whole Hausman test works.

I tried to find out whether endogeneity exists in my particular case. So I did
this:

Y ~ X + Z + Rest + error term [# this is the original regression with Z
= instrumental variable for X, X = potentially endogenous variable and Rest
= more independent variables]
Regression 1:
X ~ Z + Rest + error term
Regression 2:
Y ~ X + Rest + residuals(Reg1) + error [# I took the residuals from
Regression 1 by Reg1_resid <- cbind(Reg1$resid)]

Finally, if the coefficient for the residuals is statistically significant,
there is endogeneity. 

Is this approach correct?

p.s: My p-value is 0.1138...

Thanks for your help
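
For reference, the two-step procedure described above can be sketched in R on
simulated data as follows. The variable names (z, w, v_hat) and the
data-generating process are assumptions made for illustration, not the data
from this thread. In this sketch the instrument z enters only Regression 1 and
is left out of Regression 2.

```r
set.seed(2)
n <- 500
z <- rnorm(n)                                 # instrument for x
w <- rnorm(n)                                 # exogenous control ("Rest")
u <- rnorm(n)                                 # structural error
x <- 0.8 * z + 0.3 * w + 0.5 * u + rnorm(n)   # endogenous by construction
y <- 1 + 2 * x - 1 * w + u

# Regression 1: potentially endogenous variable on instrument and controls
reg1  <- lm(x ~ z + w)
v_hat <- resid(reg1)

# Regression 2: original equation augmented with the first-stage residuals
reg2 <- lm(y ~ x + w + v_hat)

# t-test on the residual term: a significant coefficient points to endogeneity
p_value <- summary(reg2)$coefficients["v_hat", "Pr(>|t|)"]
cat("p-value for first-stage residuals:", p_value, "\n")
```

With a single endogenous regressor, a large p-value on v_hat means the data do
not reject exogeneity of x at conventional levels; it does not prove that x is
exogenous, and the conclusion depends on the instrument being valid.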




