How can I estimate deviance explained of a mixed gamm?

Hello Jon,

If I understand you correctly, you are looking for a metric like R^2,
i.e. the "variation in the outcomes accounted for by the model". I don't
have anything insightful to add myself, but this post by Douglas Bates
may be relevant:
http://marc.info/?l=r-sig-mixed-models&m=126719474831488&w=2

I quote:

"Assuming that one wants to define an R^2 measure, I think an argument
could be made for treating the penalized residual sum of squares from
a linear mixed model in the same way that we consider the residual sum
of squares from a linear model.  Or one could use just the residual
sum of squares without the penalty or the minimum residual sum of
squares obtainable from a given set of terms, which corresponds to an
infinite precision matrix.  I don't know, really.  It depends on what
you are trying to characterize.

In other words, what's the purpose?  What aspect of the R^2 for a
linear model are you trying to generalize?

I'm sorry if I sound argumentative but discussions like this sometimes
frustrate me.  A linear mixed model does not behave exactly like a
linear model without random effects so a measure that may be
appropriate for the linear model does not necessarily generalize.  I'm
not saying that this is the case but if the request is "I don't care
what the number means or if indeed it means anything at all, just give
me a number I can report", that's not the style of statistics I
practice.

I regard Bill Venables' wonderful unpublished paper "Exegeses on
Linear Models" (just put the name in a search engine to find a copy -
there is only one paper with "Exegeses" and "Linear Models" in the
title) as required reading for statisticians.  As Bill emphasizes in
that paper, statistics is not just a collection of formulas (many of
which are based on approximations).  It's about models and comparing
how well different models fit the observed data.  If we start with a
formula and only ask ourselves "How do we generalize this formula?"
we're missing the point.  We should start at the model.

In a linear model the R^2 statistic is a dimensionless comparison of
the quality of the current model fit, as measured by the residual sum
of squares, to the fit one would obtain from a trivial model.  When
the current model can be shown to contain a model with an intercept
term only (and whose coefficient will be estimated by the mean
response) then that model fit is the trivial model.  Otherwise the
trivial model is a prediction of zero for each response.  We know that
the trivial model will produce a greater residual sum of squares than
the current model fit because the models are nested.  The R^2 is the
proportion of variability not accounted for by the trivial model but
accounted for by the current model (my apologies to my grammar
teachers for having juxtaposed prepositions).

The interesting point there is that when you think of the
relationships between models you can determine how you handle the case
of a model that does not have an intercept term.  If you start from
the formula instead you can end up calculating a negative R^2 because
you compare models that are not nested.  Such nonsensical results are
often reported.  (I think it was the Mathematica documentation that
gave a careful explanation of why you get a negative R^2 instead of
recognizing that the formula they were using did not apply in certain
cases.)

It may be that there is a sensible measure of the quality of fit from
a linear mixed model that generalizes the R^2 from a linear model.  I
don't see an obvious candidate but I will freely admit that I haven't
thought much about the problem.  I would ask others who are thinking
about this to consider both the "what" and the "why".  George
Mallory's justification of "because it's there" for attempting to
climb Everest is perhaps a good justification for such endeavors
(Mallory may have questioned his rationale as he lay freezing to death
on the mountain).  I don't think it is a good justification for
manipulating formulas."
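
To make the point about nested models concrete, here is a small sketch of my own (not from the quoted post). It uses made-up data and plain Python, and the function names are purely illustrative. It fits a line through the origin and computes "1 - RSS/RSS_trivial" against two trivial models: the mean-only model, which is not nested in a no-intercept fit, and the all-zero prediction, which is.

```python
def rss(y, fitted):
    """Residual sum of squares between observations and fitted values."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def slope_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def r_squared(y, fitted, trivial):
    """Proportion of the trivial model's RSS accounted for by the current fit."""
    return 1.0 - rss(y, fitted) / rss(y, trivial)


# Made-up data: decreasing response with a large intercept, so a line
# forced through the origin fits badly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 9.0, 8.0, 7.0, 6.0]

b = slope_through_origin(x, y)
fitted = [b * xi for xi in x]

# Trivial model 1: intercept only (predict the mean response).
# This model is NOT nested in the no-intercept fit.
mean_y = sum(y) / len(y)
r2_vs_mean = r_squared(y, fitted, [mean_y] * len(y))

# Trivial model 2: predict zero for every response.
# This model IS nested in the no-intercept fit (take b = 0).
r2_vs_zero = r_squared(y, fitted, [0.0] * len(y))

print("R^2 against mean-only model:", r2_vs_mean)
print("R^2 against zero model:     ", r2_vs_zero)
```

With these numbers the first value is negative, because the models being compared are not nested, while the second stays between 0 and 1. That is only the linear-model case; it says nothing yet about which trivial model or which residual sum of squares would be appropriate for a mixed model.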

Best regards,
Michael
