Hi there,

I am really new to statistics in R and statistics itself as well.

My situation: I ran a lot of OLS regressions with different independent variables (using the lm() function). After having done that, I know there is endogeneity due to omitted variables (or perhaps due to any other reasons). And here comes the Hausman test. I know this test is used to identify endogeneity. But what I am not sure about is:

"Can I use the Hausman test in a simple OLS regression or is it only possible in a 2SLS regression model?"
"And if it is possible to use it, how can I do it?"

Info about the data:

data = lots of data :)

x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
y1 <- data$y1

reg1 <- summary(lm(y1 ~ x1 + x2 + x3 + x4))

Thanks in advance for any support!

--
View this message in context: http://r.789695.n4.nabble.com/Hausman-test-in-R-tp4647716.html
Sent from the R help mailing list archive at Nabble.com.
Hausman test in R
6 messages · Bert Gunter, Joshua Wiley, John C Frain +1 more
1. These are primarily statistics issues, not R issues. You should
post on a statistical help list like stats.stackexchange.com, not
here.
2. However, given your acknowledged statistical ignorance, you may be
asking for trouble. I suggest you seek help from a local statistical
expert to get you started. Then, depending on your statistical
background, you may understand enough to drive safely on your own.
Also try at the R command prompt:
install.packages("fortunes")
library(fortunes)
fortune("brain surgery")
Cheers,
Bert
On Sun, Oct 28, 2012 at 1:33 PM, fxen3k <f.sehardt at gmail.com> wrote:
> [original message quoted in full; snipped]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Given my "acknowledged statistical ignorance", I tried to find a *solution* in this forum... And this is not primarily a statistical issue; it is an issue about the Hausman test in the R environment.

I cannot imagine that no one in this forum has ever done a Hausman test on OLS regressions. I read the systemfit package documentation and found only this example, which refers to 2SLS and 3SLS regressions:

library( "systemfit" )
data( "Kmenta" )
eqDemand <- consump ~ price + income
eqSupply <- consump ~ price + farmPrice + trend
inst <- ~ income + farmPrice + trend
system <- list( demand = eqDemand, supply = eqSupply )

## perform the estimations
fit2sls <- systemfit( system, "2SLS", inst = inst, data = Kmenta )
fit3sls <- systemfit( system, "3SLS", inst = inst, data = Kmenta )

## perform the Hausman test
h <- hausman.systemfit( fit2sls, fit3sls )
print( h )

--
View this message in context: http://r.789695.n4.nabble.com/Hausman-test-in-R-tp4647716p4647774.html
On 29 October 2012 16:56, fxen3k <f.sehardt at gmail.com> wrote:
[snip]

If we are talking about the same test, a Hausman test cannot be applied to OLS regressions alone. As you have already been told, you must have two estimates of the same set of coefficients to do a Hausman test. Suppose that you compute both OLS and IV estimates of a particular regression; you will then get two estimates of the coefficients in the model. If the disturbances are not correlated with the explanatory variables (no endogeneity), the two sets of coefficients will be similar. If there is endogeneity, the coefficients will be different. The Hausman test is a test of the null that the coefficients are not different. If the null is not rejected, you will probably accept the OLS regression. If the null is rejected, you may consider the IV estimate.

A Hausman test is applicable in many other situations (fixed vs. random effects, etc.). You may have problems with the estimate of the covariance matrix used in the test because, on occasion, numerical problems mean that the estimate of that matrix is not positive definite.

Most intermediate-level econometrics textbooks will have a good account of the Hausman test. Greene (2012), Econometric Analysis, 7th edition, Prentice Hall, contains a comprehensive discussion of these matters which you might read. It is not easy, but if you master the basic concepts there, your questions about their implementation in R are likely to be answered on this forum.

Best Regards

John
John C Frain Economics Department Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:frainj at tcd.ie mailto:frainj at gmail.com
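The OLS-versus-IV comparison described in John's reply can be sketched in base R as follows. This is only an illustration under stated assumptions: the data are simulated, the names z (instrument), w (exogenous control) and u (omitted factor) are hypothetical, and the 2SLS step is done by hand rather than with a dedicated package. The statistic shown is the one-degree-of-freedom Hausman contrast for the coefficient on x alone.

```r
# Hypothetical simulated data: x is endogenous because it shares the
# unobserved component u with the error of y; z is a valid instrument.
set.seed(1)
n <- 2000
z <- rnorm(n)                 # instrument (assumed valid)
w <- rnorm(n)                 # exogenous control
u <- rnorm(n)                 # omitted factor causing endogeneity
x <- 0.8 * z + u + rnorm(n)
y <- 1 + 0.5 * x + 0.3 * w + u + rnorm(n)

# Estimate 1: plain OLS
ols <- lm(y ~ x + w)

# Estimate 2: 2SLS by hand (first stage, then second stage on fitted x)
first  <- lm(x ~ z + w)
x_hat  <- fitted(first)
second <- lm(y ~ x_hat + w)
b_iv   <- coef(second)

# The second-stage lm() standard errors are not valid for 2SLS:
# the residuals must be computed with the actual x, not x_hat.
X      <- cbind(1, x, w)
Xhat   <- cbind(1, x_hat, w)
res_iv <- y - X %*% b_iv
s2_iv  <- sum(res_iv^2) / (n - ncol(X))
V_iv   <- s2_iv * solve(crossprod(Xhat))
V_ols  <- vcov(ols)

# Hausman contrast for the coefficient on x (chi-squared, 1 df).
# In finite samples the variance difference can turn out non-positive,
# which is the numerical problem mentioned in the reply above.
d       <- b_iv["x_hat"] - coef(ols)["x"]
H       <- as.numeric(d^2 / (V_iv[2, 2] - V_ols["x", "x"]))
p_value <- pchisq(H, df = 1, lower.tail = FALSE)

print(c(H = H, p_value = p_value))
```

A small p-value here indicates that the OLS and IV estimates differ by more than sampling noise would explain, which points towards the IV estimate. In practice one would probably rely on a package rather than hand-rolled matrix algebra; for instance, the AER package's ivreg() reports a Wu-Hausman test through summary(fit, diagnostics = TRUE).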
Thanks for your answer, John!

Having read Wooldridge, Verbeek and Hausman himself, I tried to figure out how this whole Hausman test works and whether endogeneity exists in my particular case. So I did this:

Y ~ X + Z + Rest + error term
[# this is the original regression, with Z = instrumental variable for X, X = potentially endogenous variable and Rest = more independent variables]

Regression 1: X ~ Z + Rest + error term
Regression 2: Y ~ X + Rest + residuals(Reg1) + error
[# I took the residuals from Regression 1 with Reg1_resid <- cbind(Reg1$resid)]

Finally, if the coefficient for the residuals is statistically significant, there is endogeneity.

Is this approach correct?

P.S.: My p-value is 0.1138...

Thanks for your help

--
View this message in context: http://r.789695.n4.nabble.com/Hausman-test-in-R-tp4647716p4647800.html
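The two-regression procedure described in this message is the regression-based (Durbin-Wu-Hausman) form of the test, and a minimal sketch of it in base R might look like the following. The data are simulated and every coefficient value is an assumption made for illustration; the variable names X, Z and Rest follow the message, and u is a hypothetical unobserved factor that makes X endogenous. Note that in this sketch the instrument Z enters only the first-stage regression, not the equation for Y.

```r
# Hypothetical simulated data in which X is endogenous by construction
set.seed(2)
n    <- 2000
Z    <- rnorm(n)              # instrument for X (assumed valid)
Rest <- rnorm(n)              # other exogenous regressor
u    <- rnorm(n)              # unobserved factor shared by X and Y
X    <- 0.8 * Z + 0.5 * Rest + u + rnorm(n)
Y    <- 1 + 0.5 * X + 0.3 * Rest + u + rnorm(n)

# Regression 1: first stage, endogenous regressor on instrument and controls
reg1       <- lm(X ~ Z + Rest)
reg1_resid <- resid(reg1)

# Regression 2: original equation augmented with the first-stage residuals
reg2 <- lm(Y ~ X + Rest + reg1_resid)

# The t-test on the residual term is the endogeneity test
resid_row <- summary(reg2)$coefficients["reg1_resid", ]
p_resid   <- unname(resid_row["Pr(>|t|)"])
print(resid_row)
```

A significant coefficient on reg1_resid is evidence against exogeneity of X. By the same reading, a p-value such as 0.1138 would not reject exogeneity at the usual 5% level, although that conclusion is only as good as the instrument: with a weak or invalid Z the test has little power or is not meaningful.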