An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20101102/f75fe42e/attachment.pl>
sandwich variance estimation using glmer?
Out of curiosity, why would you want a sandwich estimator from lmer? That estimator is typically used when the likelihood is misspecified, but you still want standard errors that account for correlations among units within a cluster. Since this is what lmer standard errors already account for, is there a need for the sandwich?
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Tyler Dean Rudolph
Sent: Tuesday, November 02, 2010 4:41 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] sandwich variance estimation using glmer?

Are there any current functionalities in R that permit estimation of robust sandwich variances based on lmer (mixed model) objects? I'm aware of the sandwich package and gee implementations but to my knowledge these are not yet compatible with mixed model objects.

Apparently these are already implemented in SAS....

Tyler
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On 11/3/2010 1:57 PM, Doran, Harold wrote:
Well, it is possible that the random-effects structure that you have specified is not the correct one (i.e., it may not fully account for the correlations), and in this case it makes sense to use the sandwich estimator (of course, the sandwich estimator has its own problems, but this is probably another discussion...).

Best,
Dimitris
--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
Tyler--

This question about "robust" (usually, heteroscedasticity-consistent) SE has come up before with respect to lme4 and lmer. For example, see the following thread:

http://tolstoy.newcastle.edu.au/R/e3/help/07/10/0754.html

I have a vague memory of seeing a post from Achim Zeileis (the author of the "sandwich" package) mentioning why it might be challenging to extend his methods to lmer, but I couldn't (quickly) find it.

[BTW, you might be cautious about comparisons to SAS... R is a wonderful resource, but not universally guaranteed to do everything SAS can, and sometimes for good reasons. Moreover, it is the result of an incredible, collective generosity of time and expertise (ie, nobody is making money here...).]

Hope that helps.

cheers, Dave
Tyler wrote:
Indeed, in this case the correlation structure of the random effects is not fully appreciated or known, in which case the standard errors are likely underestimated. The use of sandwich estimators should render variance estimates, and therefore inference, somewhat more realistic. While this is currently possible with GEEs, that approach does not ask the same question as a GLMM (i.e. marginal or "population" estimates vs. conditional or "subject-specific" estimates).

I used to think R updates were ahead of SAS upgrades in terms of new approaches but apparently that is often not the case. Does anyone have the know-how required to implement this in R, or is there something I'm still missing?

Best,
Tyler
--
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu

Center for the Study of Health and Risk Behaviors (CSHRB)
1100 NE 45th Street, Suite 300
Seattle, WA 98105
206-616-3879
http://depts.washington.edu/cshrb/ (Mon-Wed)

Center for Healthcare Improvement, for Addictions, Mental Illness, Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
http://www.chammp.org (Thurs)
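[Editor's sketch] For readers wondering what the "sandwich" being discussed actually computes: below is a minimal base-R illustration of the cluster-robust (Huber-White / Liang-Zeger) variance for a fixed-effects fit that ignores clustering. The simulated data and all numbers are arbitrary; current versions of the sandwich package automate this for lm/glm fits via vcovCL(), though (as the thread notes) not for lmer objects.

```r
set.seed(42)
g <- rep(1:30, each = 10)            # 30 clusters of 10 observations
u <- rnorm(30)[g]                    # cluster random intercepts (ignored by the fit)
x <- rnorm(300)
y <- 1 + 0.5 * x + u + rnorm(300)

fit <- lm(y ~ x)                     # fixed-effects fit that ignores clustering
X   <- model.matrix(fit)
e   <- residuals(fit)

bread <- solve(crossprod(X))         # (X'X)^{-1}
# meat: sum over clusters of X_g' e_g e_g' X_g
meat <- Reduce(`+`, lapply(split(seq_along(g), g), function(i) {
  s <- crossprod(X[i, , drop = FALSE], e[i])   # X_g' e_g
  tcrossprod(s)                                # outer product
}))
V_cl <- bread %*% meat %*% bread     # the sandwich: bread %*% meat %*% bread

cbind(naive_SE   = sqrt(diag(vcov(fit))),
      cluster_SE = sqrt(diag(V_cl)))
# The intercept's cluster-robust SE is inflated by the ignored cluster effect.
```

This is the CR0 variant with no small-sample correction; the point is only the bread-meat-bread form that Harold and Dimitris are debating.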
Hi Tyler,

I think that there's something that you're missing. R is not motivated by comparisons with SAS or any package. So, your impression that R was ahead of SAS or behind SAS is mistaken, or at least, it's your impression, so you are responsible for it. R responds exactly to the community's needs because the community supports it. If the functionality that you want isn't there, it's because no one else has wanted it badly enough to a) code it up, or b) pay someone else to code it up.

If you want that function, and you know that SAS has it, then use SAS. If you want to use that function in R, then see the above two points.

Good luck,

Andrew
--
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics   Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr   Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/
On 4/11/2010, at 10:49 AM, Andrew Robinson wrote:
<SNIP>
....... If the functionality that you want isn't there, it's because noone else has wanted it badly enough to a) code it up, or b) pay someone else to code it up.
<SNIP>

There is another possibility that should be taken into consideration: the person wanting that functionality may lack both the requisite skills and insight to code up the functionality himself or herself, and the financial resources to pay someone else to do it.

Yet another possibility is that the person responsible for the original code simply never thought of including the requested functionality. I am sure I've seen instances where people have asked (on the R-help list) for features in specific packages, and the maintainers of those packages have responded ``Yes, that's a good idea. I'll include that in the next release.'' I can't give specific examples off the top of my head. However, I know for certain that I have accommodated such feature requests from users in respect of packages that I maintain (specifically Iso and deldir).

cheers,

Rolf Turner
Hi Tyler, I guess that it is a sensitive topic. This is a community of volunteers. Specious comparisons with commercial software products aren't helpful, unless they're *positive* specious comparisons ;). The other problem with working in a community is that it's very difficult for any one person to definitively state that functionality does not exist and is not presently being worked on. So, I repeat, good luck .... Andrew
On Wed, Nov 03, 2010 at 06:06:43PM -0400, Tyler Dean Rudolph wrote:
Hi Andrew,

Unfortunately I do not have access to SAS, so that is simply not an option for me, though I do welcome your clarification. If this is a sensitive topic perhaps I will abstain from mentioning it in future, but to me it was a simple observation and not a value statement requiring qualification.

Perhaps I should put this another way: can anyone confirm that this functionality does NOT exist or is NOT presently being worked on somewhere within the R sphere?

Tyler
Let me push on this just a bit to spark further discussion. The OP was interested in robust standard errors given misspecification in the likelihood. So, one possible avenue was to explore Huber-White standard errors, or the sandwich estimator, to account for this misspecification and obtain "better" standard errors, but still use the point estimates of the fixed effects as given. Some discussion on this has noted that the misspecification occurs in many ways, sometimes because distributional assumptions were not met.

Let's assume someone was willing and skilled enough to code up the HW as a possible solution within lmer to account for not meeting certain distributional assumptions. My question is now: why not directly code up models that permit different distributional assumptions, such as t-distributions of residuals (random effects) or whatever the case might be? In other words, why not write code that addresses the problem directly (misspecification of the likelihood) rather than focusing on HW estimates? Isn't it a better use of time and energy to focus on properly specifying the likelihood and estimating parameters from that model rather than HW?
On 10-11-04 10:18 AM, Doran, Harold wrote:
I would say there are two issues here, one philosophical and one technical/pragmatic.

Philosophical: how should one divide one's time and effort between trying to come up with better parametric models vs. admitting that 'all models are wrong' and coming up instead with robust alternatives that perform reasonably well across a broad range of (unknown/unspecified) parametric models?

Technical/pragmatic: taking your case of t-distributions of residuals -- this could be reasonably straightforward to implement in a framework that is closer to a brute force computational solution (MCMC, AD Model Builder), but I wouldn't know where to begin trying to implement it in the elegant/computationally efficient lme4 framework ...

cheers
Ben Bolker
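[Editor's sketch] As a toy illustration of the "brute force" route: direct maximum likelihood for a plain fixed-effects regression with t-distributed errors, via optim(). This deliberately has no random effects -- extending it to mixed models is exactly the hard part Ben describes -- and the fixed df = 4 and simulated data are arbitrary choices for the sketch (the degrees of freedom could also be estimated as a parameter).

```r
set.seed(7)
n <- 200
x <- rnorm(n)
y <- 2 + 1.5 * x + rt(n, df = 4)     # heavy-tailed errors

# Negative log-likelihood for y = b0 + b1*x + sigma * t(df=4) errors
negll <- function(p) {
  b0 <- p[1]; b1 <- p[2]; sigma <- exp(p[3])   # log scale keeps sigma > 0
  -sum(dt((y - b0 - b1 * x) / sigma, df = 4, log = TRUE) - log(sigma))
}

# Start from the ordinary least-squares fit
ols   <- lm(y ~ x)
start <- c(coef(ols), log(sd(residuals(ols))))
opt   <- optim(start, negll, hessian = TRUE)

est <- c(b0 = opt$par[1], b1 = opt$par[2], sigma = exp(opt$par[3]))
se  <- sqrt(diag(solve(opt$hessian)))  # asymptotic SEs on the optimized scale
round(est, 3)
```

With heavy-tailed errors, the t-likelihood downweights outlying observations relative to OLS, which is the "address the misspecification directly" idea Harold raises.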
One slightly different perspective on robust SE in mixed models:

The place where I have seen these used regularly is in the HLM software (popular in education and psychology circles). HLM *always* reports both "standard" and robust SE.

What I find interesting is that if you read Raudenbush and Bryk (and the HLM manual), they suggest using the robust SE as a model diagnostic (my term). That is, when there is a discrepancy between SE, they rightly note that something is amiss, and you should do further detective work related to the random-effects specification. That seems like a very valid use of robust SE, though I fully acknowledge such info (ie, model isn't fitting well) could be got other ways.

[BTW, I'd love to see other robust approaches, such as t-distributed error and/or priors, but as Ben notes that's an awfully high bar to implement -- either in lmer or MCMCglmm. The heavy package is an initial attempt, but seems to be "stalled out" at the moment.]

For what it's worth.

cheers, Dave
On 11/4/10 3:06 PM, Andrew Robinson wrote:
That's an interesting point, Dave. Surely, however, there are better model diagnostics? By 'better' I mean more likely to pinpoint the source of lack of fit within a model? Personally I wouldn't rely on the concordance between robust and asymptotic SE as an indicator of model appropriateness. I don't see how it adds anything important to the diagnostic process. 1) if the match is bad, I'll examine the diagnostics, but 2) if the match is good, I'll examine the diagnostics anyway, as part of due diligence.
Andrew-- You are, of course, absolutely correct. I use R almost exclusively, and for the most part, not having robust SE doesn't really cause any problems for me (in general, or with respect to checking diagnostics... actually, I love it!). From my own experience, I have seen the HLM's software use of both be helpful in the following way: I have several times had colleagues / students come to me with output from HLM, noting the discrepancy between SE and wondering what this means? In essence, b/c that software puts both of them in front of you, it's hard to miss when they are different (esp. since p-values will then be radically different). These times tend to be "teachable moments" in terms of what it is saying about the model and data (and fit). More often that not, the outcome has been some form of count variable (which lurk quite commonly in psychological/psychiatric waters), and thus we can talk about distributional properties of the outcome vs. model, etc. (And I think the mean-variance relationship of the Poisson tends to lead to large adjustments with the robust SE.) I would in no way suggest that lme4 should follow suit with the HLM software, and I honestly doubt whether it would serve a similar purpose, as more novice statistical users tend to be intimidated by R (given lack of menus, GUI, etc.). As I mentioned before, what I *would* be interested in are robust approaches a la incorporating t-distribution in prior or likelihood. I am currently working with a colleague on an analysis of a small number of groups, where we have discused this -- he's exploring WinBUGS as an option at present (to follow-up on our analyses with glmer and MCMCglmm). For what it's worth (and realizing it ain't all that much about R...). cheers, Dave
Also I don't know of any theory that suggests that deviation between
robust and asymptotic standard error estimates *always* indicates a
model problem. If anyone does, I'm happy to learn about it.
On the other hand, robust SE are (likely to be) larger than asymptotic
SE. So, if I see a deviation, it could be because (a) there's a problem
with my model, and (b) the model is fine but I'm paying a price for
robustness.
All of which raises an interesting question: if we observe a deviation
between the estimates and hunt through the model diagnostics for a
cause, but can't find one -- then what do we do?
Cheers
Andrew
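[Editor's note: for readers who want to see the robust/asymptotic comparison mechanically, here is a base-R sketch of the cluster-robust (Huber-White) sandwich estimator for a plain lm() fit on made-up clustered data. It illustrates the estimator under discussion, not an lmer method.]

```r
## Hand-rolled cluster-robust (CR0) sandwich SEs for an OLS fit
## on hypothetical data with a cluster-level random intercept.
set.seed(42)
g <- rep(1:20, each = 10)          # 20 clusters of 10 observations
u <- rnorm(20)[g]                  # cluster effect ignored by lm()
x <- rnorm(200)
y <- 1 + 2 * x + u + rnorm(200)

fit <- lm(y ~ x)
X   <- model.matrix(fit)
e   <- resid(fit)

## "bread" (X'X)^-1 and clustered "meat" sum_j X_j' e_j e_j' X_j
bread <- solve(crossprod(X))
meat  <- Reduce(`+`, lapply(split(seq_along(g), g), function(i) {
  s <- crossprod(X[i, , drop = FALSE], e[i])
  tcrossprod(s)
}))
V_robust <- bread %*% meat %*% bread

cbind(asymptotic = sqrt(diag(vcov(fit))),
      robust     = sqrt(diag(V_robust)))
```

Because the clustering is ignored by lm(), the robust SE for the intercept exceeds the asymptotic one, which is the kind of deviation the thread is debating how to interpret.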
On Fri, Nov 5, 2010 at 2:59 AM, David Atkins <datkins at u.washington.edu> wrote:
One slightly different perspective on robust SE in mixed models:
The place where I have seen these used regularly is in the HLM
software (popular in education and psychology circles). HLM
*always* reports both "standard" and robust SE.
What I find interesting is that if you read Raudenbush and Bryk (and
the HLM manual), they suggest using the robust SEs as a model
diagnostic (my term). That is, when there is a discrepancy between the
two sets of SEs, they rightly note that something is amiss and you
should do further detective work on the random-effects specification.
That seems like a very valid use of robust SEs, though I fully
acknowledge such information (i.e., that the model isn't fitting well)
could be gotten other ways.
[BTW, I'd love to see other robust approaches, such as t-distributed
error and/or priors, but as Ben notes that's an awfully high bar to
implement -- either in lmer or MCMCglmm. The heavy package is an
initial attempt, but seems to be "stalled out" at the moment.]
For what it's worth.
cheers, Dave
Harold wrote:
Let me push on this just a bit to spark further discussion. The OP
was interested in robust standard errors given misspecification in
the likelihood. So, one possible avenue was to explore Huber-White
standard errors, or the sandwich estimator, to account for this
misspecification and obtain "better" standard errors, but still use
the point estimates of the fixed effects as given.
Some discussion on this has noted that misspecification can occur in
many ways, sometimes because distributional assumptions were not met.
Let's assume someone was willing and able to code up the HW estimator
as a possible solution within lmer to account for not meeting
certain distributional assumptions.
My question is now why not directly code up models that permit for
different distributional assumptions, such as t-distributions of
residuals (random effects) or whatever the case might be? In other
words, why not write code that addresses the problems directly
(misspecification of the likelihood) rather than focusing on HW
estimates.
Isn't it a better use of time and energy to focus on properly
specifying the likelihood and estimating parameters from that model
rather than HW?
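[Editor's note: as a sketch of Harold's suggestion to address the likelihood directly, the following base-R example fits a linear model with t-distributed errors by maximum likelihood via optim(). The data and the fixed df = 4 are arbitrary choices for illustration.]

```r
## Linear regression with a Student-t likelihood, instead of
## patching Gaussian-model SEs after the fact.
set.seed(7)
x <- rnorm(100)
y <- 1 + 2 * x + rt(100, df = 3)   # heavy-tailed noise

negll <- function(par, df = 4) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])             # log-scale keeps sigma > 0
  ## scaled-t log-density: log dt((y - mu)/sigma) - log(sigma)
  -sum(dt((y - mu) / sigma, df = df, log = TRUE) - log(sigma))
}

## start from the OLS coefficients, log(sigma) = 0
init  <- c(coef(lm(y ~ x)), 0)
fit_t <- optim(init, negll, method = "BFGS")
fit_t$par[1:2]   # intercept and slope under the t likelihood
```

Extending the same idea to t-distributed random effects in a mixed model is, of course, the genuinely hard part that the thread keeps circling back to.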
--
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu
Center for the Study of Health and Risk Behaviors (CSHRB)
1100 NE 45th Street, Suite 300
Seattle, WA 98105
206-616-3879
http://depts.washington.edu/cshrb/
(Mon-Wed)
Center for Healthcare Improvement, for Addictions, Mental Illness,
Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
http://www.chammp.org
(Thurs)
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Andrew Robinson
Program Manager, ACERA
Senior Lecturer in Applied Statistics
Department of Mathematics and Statistics
University of Melbourne, VIC 3010 Australia
Tel: +61-3-8344-6410    Fax: +61-3-8344-4599
Email: a.robinson at ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au
On 10-11-05 12:12 PM, David Atkins wrote:
On 11/4/10 3:06 PM, Andrew Robinson wrote:
Channeling Dave Fournier here for a few moments ... For that particular project, I would recommend that you give AD Model Builder and (to plug my own project and make things a little more R-ish) the R2ADMB package a try.

AD Model Builder is fast, flexible, uses the Laplace approximation for mixed modeling, and allows you to do post-hoc MCMC sampling around the estimated modes; changing from Gaussian to t-distributed residual/likelihood calculations is in principle just a matter of adding a parameter and changing the line of code that does the likelihood calculation. And ADMB is now open source.

The downside is that you have to learn a new system (objective functions in ADMB are written in a superset of C++). Overall I would say that the effort, and the payoff, of learning ADMB are on the same order of magnitude as learning WinBUGS/JAGS ...

Ben Bolker
Do you have a WinBUGS example of the model you would like to fit? If it is as simple as using a t-distributed prior for your coefficients, then it's pretty simple to do that in WinBUGS or PyMC. Of course, I'd like to plug my own project, which is a C++ equivalent of WinBUGS, now directly callable from R via inline. I'm happy to translate a WinBUGS model and post the cppbugs equivalent.

-Whit
On Fri, Nov 5, 2010 at 12:25 PM, Ben Bolker <bbolker at gmail.com> wrote: