Unable to reproduce Stata Heckman sample selection estimates
Hi Arne,

Thanks for the reply. I am using R version 2.14.0 and sampleSelection version 0.6.12. I estimate the model by the 1-step ML method. However, when I use the 2-step method, the standard errors are reported as NA.

I use the selection() function with a very basic call, something to the effect of: selection(selectionFormula, outcomeFormula, data = aDataFrame), where the formulas are very straightforward and basic as well: y ~ x1 + x2 + ... + xp.

I have read the associated paper, which is where I got the idea to pass the coefficients from a selection object to the start argument.

I will work on creating a minimal reproducible example; the dataset is large and confidential, and the models are long-ish.

- Clara
On Friday, November 25, 2011 04:04:52 am Arne Henningsen wrote:
On 25 November 2011 04:37, Yuan Yuan <y.yuan at vt.edu> wrote:
Hello,

I am working on reproducing someone's analysis which was done in Stata. The analysis is estimation of a standard Heckman sample selection model (Tobit-2), for which I am using the sampleSelection package and the selection() function. I have a few problems with the estimation:

1) The reported standard error for all estimates is Inf ... vcov(selectionObject) yields Inf in every cell.
2) While the selection equation coefficient estimates are almost exactly the same as the Stata results, the outcome equation coefficient estimates are quite different (different sign in one case, order of magnitude difference in some other cases).
3) I can't seem to figure out how to specify the initial values for the MLE ... whatever argument I pass to start (even of the form coef(selectionObject)), I get the following error:
Error in gr[, fixed] <- NA : (subscript) logical subscript too long

I have to admit I am pretty confused by #1, I feel like I must be doing something wrong, missing something obvious, but I have no idea what. I figure #2 might be because the algorithms (selection and Stata) are just finding different local maxima, but because of #3 I can't test that guess by using different initial values in selection.

Let me know if I should provide any more information. Thanks in advance for any pointers in the right direction.
Yes, please provide more information (see also the posting guide [1]), e.g. which version of R and which version of the sampleSelection package are you using? Do you estimate the model by the two-step approach or by the 1-step maximum likelihood method? Which commands did you use? Can you send us a reproducible example? Have you read the paper about using the sampleSelection package [2]?

[1] http://www.r-project.org/posting-guide.html
[2] http://www.jstatsoft.org/v27/i07

Best wishes from Copenhagen,
Arne
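
A reproducible example of the kind requested above could be built on simulated data, since the real dataset is confidential. The following is only a rough sketch under stated assumptions: the variable names (x1, x2, z, y, observed), the sample size, and the true coefficients are all made up for illustration and have nothing to do with the original analysis. It computes the two-step (heckit) point estimates by hand in base R, which gives something to compare the outcome coefficients against, and then, if the sampleSelection package is installed, fits the same model with selection() using method = "2step" and method = "ml".

```r
# Simulated data only; every name and number here is hypothetical.
set.seed(42)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
z  <- rnorm(n)   # enters the selection equation only (exclusion restriction)

# Errors of the two equations, correlated with rho = 0.5
u <- rnorm(n)
e <- 0.5 * u + sqrt(1 - 0.5^2) * rnorm(n)

observed <- (0.5 + x1 + z + u) > 0                 # selection equation
y <- ifelse(observed, 1 + 2 * x1 - x2 + e, NA)     # outcome, seen only if selected
dat <- data.frame(y, observed, x1, x2, z)

# Step 1: probit for selection, then the inverse Mills ratio
probit <- glm(observed ~ x1 + z, family = binomial(link = "probit"), data = dat)
xb <- predict(probit, type = "link")
dat$imr <- dnorm(xb) / pnorm(xb)

# Step 2: OLS on the selected observations, with the inverse Mills ratio added.
# The point estimates are the two-step estimates; the standard errors that
# lm() reports are NOT corrected for the estimated regressor.
step2 <- lm(y ~ x1 + x2 + imr, data = dat, subset = observed)
print(coef(step2))

# Same model via sampleSelection, if the package is available
if (requireNamespace("sampleSelection", quietly = TRUE)) {
  library(sampleSelection)
  fit2step <- selection(observed ~ x1 + z, y ~ x1 + x2, data = dat,
                        method = "2step")
  fitML    <- selection(observed ~ x1 + z, y ~ x1 + x2, data = dat,
                        method = "ml")
  print(summary(fit2step))
  print(summary(fitML))
}
```

If the simulated model estimates cleanly but the real one yields Inf or NA standard errors, that would point towards the data or the model specification rather than the call itself.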