Significance and lmer

5 messages · Ben Bolker, Adam D. I. Kramer, David Duffy

#
Dear colleagues,

Please consider this series of commands:

a <- lmer(log(stddiff+.1539) ~ pred + m*v + option + (option|studyID),
data=r1, subset=option>1, REML=FALSE)

b <- update(a, . ~ . - pred)

anova(a,b)

...am I mistaken in thinking that the latter command will produce a test of
whether "pred" is a significant predictor of log(stddiff+.1539)? I am
concerned because of the results:
Estimate   Std. Error    t value
(Intercept) -0.6608993664 0.1591862808 -4.1517357
pred         0.0879255592 0.1715599954  0.5125062
ml           0.0656916428 0.1173308419  0.5598838
vl          -0.0980204413 0.1276648229 -0.7677952
option       0.0003197903 0.0008134259  0.3931400
ml:vl       -0.1890574941 0.1710443092 -1.1053130

...note the t-value of 0.51 for pred...very small! ...but anova(a,b) produces this:

Models:
b: log(stddiff + 0.1539) ~ m + v + option + (option | studyID) +
b:     m:v
a: log(stddiff + 0.1539) ~ pred + m * v + option + (option | studyID)
   Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
b  9 3969.2 4019.1 -1975.6
a 10 3955.9 4011.2 -1967.9 15.345      1  8.954e-05 ***
---

...a significant result completely unrelated to the t-value. My
interpretation of this would be that we have no good evidence that the
estimate for 'pred' is nonzero, but including pred in the model improves
prediction.

I think I must be missing something here--I would appreciate anyone's input
on what that "something" is.

Cordially,
--
Adam D. I. Kramer
Ph.D. Candidate, Social Psychology
University of Oregon
adik-rhelp at ilovebacon.org
#
Adam D. I. Kramer <adik at ...> writes:
[snip]
It is possible for Wald tests (as provided by summary()) to
disagree radically with likelihood ratio tests (look up "Hauck-Donner
effect"), but my guess is that's not what's going
on here (it definitely can apply in binomial models; I don't think
it should apply to LMMs, but ?).

  I have seen some wonky stuff happen with update() [sorry, can't
provide any reproducible details], I would definitely try fitting
b by spelling out the full model rather than using update() and
see if that makes a difference.
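
  A minimal sketch of that suggestion, using the formula and data names
from the original post (the object name b2 is just for illustration):

```r
## Fit the reduced model by spelling out the full call, rather than
## letting update() re-evaluate an altered version of a's call:
b2 <- lmer(log(stddiff + .1539) ~ m * v + option + (option | studyID),
           data = r1, subset = option > 1, REML = FALSE)
anova(a, b2)
```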

  Other than that, nothing springs to mind.

  (Where does the log(x+0.1539) transformation come from???)
#
On Sat, 27 Mar 2010, Ben Bolker wrote:

There are no Wald tests produced by the summary()...my understanding from
reading this list is that the t-values are provided because they are t-like
(effect / se), but that it is difficult (and perhaps foolish) to estimate
degrees of freedom for t. So my concern is based on the fact that t is very
small.
Spelling out the full model for b (rather than using update()) produces no
difference in b's estimates or the anova() statistics.
(That said, I originally was fitting [implicitly] with REML=TRUE, which did
make a difference, but not a big one.)
Well, thanks for the reply. Are you, then, of the opinion that the above
interpretation is reasonable?
x is power-law distributed with a bunch of zeroes (but not ordinal, or I'd
use family=poisson), and .1539 is the 25th percentile. This normalizes it
pretty well. Good question, though! And thanks for the response!
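
A sketch of how that shift might be computed, assuming .1539 is the 25th
percentile of stddiff itself (the post doesn't say exactly which quantile
call was used):

```r
## Assumed reconstruction: take the shift from the data, then eyeball
## the transformed distribution for approximate normality.
shift <- quantile(r1$stddiff, 0.25, na.rm = TRUE)  # reportedly 0.1539 here
hist(log(r1$stddiff + shift))
```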

--Adam
#
On Sat, 27 Mar 2010, Adam D. I. Kramer wrote:
The two models both have the same number of observations, one hopes?  How 
many observations per studyID and how many studyIDs?
I would be a bit nervous.  My interpretation would be that the model is 
inappropriate for the data (as the Wald and LR tests should roughly agree 
for a LMM, as Ben pointed out), and would look at diagnostic plots of 
residuals etc.  The bunch of zeroes you mention may still be stuffing 
things up ;)  Is a left-censored model plausible?

Just my 2c, David Duffy.
#
The problem turned out to be, indeed, differing numbers of observations.
This is likely due to me relying too much on update() to work as I
expected: when pred was removed from the formula, the rows that lmer had
dropped for missing pred were no longer dropped, so b was fit to more
observations than a. The help page for update makes it very clear that it
just re-evaluates an altered call, so this is my fault. Ben's comment about
update() being wonky should have given me a hint.

Preselecting cases using complete.cases() for both models brought the t
values and chi-square values much closer together--when t=.51 for the
coefficient, the chisq of the likelihood ratio test for removing the
variable from the model was chisq=.25, leading to a reasonable p=.62.
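
A sketch of that preselection, assuming the column names from the original
post (the vars vector and r1c are illustrative names, not from the thread):

```r
## Keep only rows complete for every variable in the *larger* model,
## so both fits see exactly the same observations:
vars <- c("stddiff", "pred", "m", "v", "option", "studyID")
r1c  <- r1[complete.cases(r1[, vars]), ]

a <- lmer(log(stddiff + .1539) ~ pred + m * v + option + (option | studyID),
          data = r1c, subset = option > 1, REML = FALSE)
b <- update(a, . ~ . - pred)  # now safe: dropping pred can't change the rows
anova(a, b)
```

With identical row sets, the nested-model likelihood ratio test in anova()
and the Wald-style t statistic should roughly agree, as noted above.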

Thanks very much to you and Ben Bolker!

--Adam
On Sun, 28 Mar 2010, David Duffy wrote: