This looks reasonable to me (at least following Hardin and Hilbe's
definition of the score residual). For the Poisson distribution, it
turns out that the score residuals and the raw residuals are the same
(because (1/d(eta)/d(mu))) = Var(mu) in this case).
Here's a function (not thoroughly tested) that computes score
residuals more generally, at the price of using some information about
the internal structure of the object -- it's a fairly little translation
of Hardin and Hilbe's definition.
scoreResids <- function(object,...) {
ff <- object at resp$family
eta.mu <- 1/ff$mu.eta(predict(object))
y <- object at resp$y
mu <- object at resp$mu
(y-mu)/ff$var(mu)/eta.mu
}
Otherwise, though, I haven't had the time to work through to see why
working residuals are used in estfun.glm. I'm guessing that the issue
is actually a difference in the lme4:::weights.merMod function, which
doesn't actually have a "working" option (but the option is silently
ignored !!) -- I think that if one implemented 'working' rather than
'prior' weights, that this would probably work in parallel with
estfun.glm.
I agree that your functions will be very useful.
Ben Bolker
On 14-02-17 05:02 AM, Theodore Lytras wrote:
Dear list,
I am a PhD student in epidemiology, trying to analyze a multi-site cohort
study with R. I have a series of mixed-effects Poisson model objects
fitted
with glmer(), and I am trying to find a way to calculate Huber-White
robust
standard errors for these.
The only program I found that calculates Huber-White SEs for mixed-model
fits is GLLAMM in Stata, so I would like to reproduce this functionality
in R.
I took as a starting point the package "sandwich" by Achim Zeileis, which
unfortunately provides a sandwich covariance matrix only for glm objects.
So I tried to modify his code (functions estfun.glm(), bread.glm() and
sandwich()) and the function robust.se() found in [1] to make it work
with glmer objects, simplifying things as much as possible (focusing
only on Poisson fits).
Thus I came up with the code in [2]. Function robustSEglmm() takes a
glmer fit with family="Poisson" and the associated clustering variable
and tries to compute Huber-White standard errors.
What I found was the following: if estfunGlmm() uses the "working"
residuals (provided by residuals.merMod()), as is the case in the
original estfun.glm() and explained in the "sandwich" package
documentation, the resulting SEs are far too large, 6-10 times as those
produced by GLLAMM.
However, after reading a few things about the various types of residuals
in
[3], I tried replacing the working residuals in estfunGlmm() with "score"
residuals, and to my surprise I obtained SEs very close to those returned
by GLLAMM (?1%).
So my question is: is this approach statistically correct and valid?
Since I can't say I really understand the maths involved (I'm a medical
doctor by profession), I turned to Achim Zeileis for help. Cutting and
pasting from his answer:
[quote]
Well, one needs to work out the "right" residuals: The estfun() needs to
be the derivative of the log-likelihood with respect to the parameters.
And the bread() essentially the second derivative.
Given that the derivative of the log-likelihood is also called the score
function, it may well be that the "score residuals" that you mention
above
are what you are looking for. But I'm not familiar enough with the model
classes and their implementation in R to know for sure.
[/quote]
So, could someone more familiar with the internals of lme4 shed more
light on the issue? If this approach to calculate Huber-White standard
error is indeed correct, I guess it would be useful to many people
beside myself (I've seen this discussed before on this list) and would
extend the capabilities of R.
Thank you all in advance,
Theodore Lytras
[1] http://diffuseprior.wordpress.com/2013/01/12/the-cluster-bootstrap/
[2] https://gist.github.com/anonymous/9046934
[3]
http://books.google.com/books?id=tOeqO6Hs-6gC&lpg=PA53&vq=residuals&hl=el
&pg=PA55#v=onepage&q&f=false