lmer: No significant coefficients, but significant improvement of model fit?
Dear Ben, thank you very, very much for your extensive answer! I have run both models you suggest, and they fit equally well (Chi^2 = .99). In addition, the effect of each predictor is the same in both models (which makes sense, of course). I will just look at each effect's significance level, as per your worries regarding the number of schools (i.e. 8 is not many schools). Before endeavouring more interpretation, though, I will first read up on mixed models using the valuable suggestions given earlier :-) Again, thank you very, very much for your answer, kind regards, Gjalt-Jorn *Gjalt-Jorn Peters* | http://behaviorchange.eu Behavior change research | Health psychology | Intervention development | Applied social psychology [ GG <http://greatergood.eu> OU <http://ou.nl> UM <http://maastrichtuniversity.nl> ]
On 07-11-2012 16:00, Ben Bolker wrote:
<r-sig-mixed-models at ...> writes:
Hey all, This is my first post - but I assume that, like at other lists, brevity is appreciated, so I have a short version and a long version:
thanks. I will answer the short version and see how far I get with the long version.
SHORT VERSION, QUESTIONS ONLY:
1) how is it possible that, using lmer, none of the fixed effects has significant coefficients, yet the model with those parameters fits significantly better than a model without those parameters? Is this an example of why lmer didn't use to report p-values for the coefficients?
This is not really an lmer question, but a more general modeling
question. There are a few things you could mean here, but I don't
think any of them have to do with the "p-value issue", which is
more one of how to deal with the unknown distribution of the test
statistic under the null hypothesis for not-large data sets
(see http://glmm.wikidot.com/faq for more links on the p-value stuff,
and others)
* you could be asking about the difference between the results
of summary() [which uses Wald tests based on local curvature]
and anova() [which does a more precise test based on model comparison];
anova() is not perfect, but it's more accurate than (and hence sometimes
gives different results from) summary()
* you could be asking about multiple predictors, none of which
is individually significant at p<0.05, but their combined effects
(i.e. comparing a model with all predictors vs. none) are significant
at p<0.05. This is not really surprising, because the joint effect
of the predictors can be stronger than any one individually. (Also,
if you're not working with a balanced, nested LMM, the effects of
the predictors can interact.)
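To make that second point concrete, here is a small simulated sketch (toy data, not the poster's dataset; variable names x1, x2, y are made up): two strongly correlated predictors where each per-coefficient test can be non-significant while the joint test comparing the full model against the intercept-only model is significant.

```r
## Toy illustration: correlated predictors can be jointly significant
## even when neither looks significant on its own.
set.seed(101)
n  <- 40
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)       # nearly collinear with x1
y  <- x1 + x2 + rnorm(n, sd = 5)    # noisy response
full <- lm(y ~ x1 + x2)
null <- lm(y ~ 1)
summary(full)$coefficients[, "Pr(>|t|)"]  # per-coefficient (Wald-type) tests
anova(null, full)                         # joint test of both predictors at once
```

Because x1 and x2 carry almost the same information, each coefficient is poorly determined individually, but dropping both at once clearly hurts the fit.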
2) what do the slash and the colon mean exactly when specifying lmer models?
A colon refers to an interaction, a slash refers to nesting (so ~a/b is equivalent to ~a+a:b, or "b nested within a"): there's more on this at the wikidot FAQ as well.
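A quick way to see the slash expansion for yourself is to ask terms() for the term labels (a and b here are just placeholder names):

```r
## The slash is shorthand: ~ a/b expands to ~ a + a:b ("b nested within a"),
## so both formulas produce the same set of model terms.
attr(terms(~ a/b),     "term.labels")  # "a"   "a:b"
attr(terms(~ a + a:b), "term.labels")  # "a"   "a:b" -- identical
```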
LONG VERSION WITH BACKGROUND: I am inexperienced with mixed models, but I have a dataset with several levels that needs to be analysed - and I 'always' wanted to learn multilevel analysis anyway, so I decided this was a good occasion. However, there are no courses at hand in the near future, so I'm trying to get there with online resources and some books (such as "Discovering Statistics Using R" by Andy Field, and, in a slightly different category, the Multilevel Analysis books by Joop Hox and by Snijders & Bosker). However, apparently I lack what it takes to learn this autodidactically :-/ So I apologise, but I decided to draw on your wisdom. I'm also kind of hoping that doing multilevel analyses is a good way of learning how to do them.
I must admit that I don't feel like I have mastered the lmer model formulation, but I found a post by Harold Doran [1] where he explains the lmer syntax. My data file is structured the same as the one he models in fm3, fm4 and fm5. I have the following variables (of interest):
* cannabisUse_bi: a factor with two levels, "0" and "1". '0' indicates no cannabis use in the past week; '1' indicates cannabis use in the past week. This is the dependent variable (i.e. the criterion).
* moment: a factor with two levels, 'before' and 'after'.
* id.factor: a factor with 444 levels, the identification of each participant (note that there are quite a lot of missing values; only about 276 cases are without missings).
* school: a factor with 8 levels, each representing the school that the participants attend.
* cannabisShow: a factor with 2 levels, 'control' and 'intervention' - this reflects whether a participant received the 'intervention', aimed to decrease cannabis use, or not. Participants in five schools received the intervention; participants in the three other schools didn't.
Every person provided two datapoints (one before the intervention took place and one after); there are several persons in each school; and there are several schools in each condition (level) of cannabisShow.
As far as I understand, this translates to "Moment is nested within person ('id.factor'), which is nested within school, which is nested within cannabisShow" (not sure about that last bit).
Although others on this list disagree, I don't find "nesting" to be
very useful in the context of fixed effects, because the levels of
fixed effects almost always have identical meanings across different
levels of the random effect (i.e., "before" means the same for me as
for you)
I would say the simplest sensible model would be
glmer(cannabisUse_bi ~ cannabisShow*moment + (1|school/id.factor),
family=binomial, data=dat.long)
which if your individuals are uniquely identified should be the same
as using (1|school) + (1|id.factor) as the random effects.
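A sketch of why this holds, using hypothetical school and id factors: when every id occurs in exactly one school, the combined grouping factor school:id has exactly the same groups as id alone, so the nested and crossed specifications define the same random-effects structure.

```r
## If ids are unique across schools, school:id collapses to id itself,
## so (1|school/id) and (1|school) + (1|id) give the same groupings.
school <- factor(rep(c("A", "B"), each = 3))
id     <- factor(paste0("p", 1:6))              # unique across schools
combo  <- droplevels(interaction(school, id))   # the school:id grouping
nlevels(combo)   # 6 groups -- one per id, nothing new added
nlevels(id)      # 6
```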
But I agree that you may very well want to try to take into account
whether the effects of the fixed effects differ among schools: you
might _like_ to see whether they differ among individuals as well, but
it is somewhere between impossible and very difficult to extract this
from binary data per individual (I'm sure you can't identify the
effects of cannabisShow, because each individual only gets one
intervention, and I'm pretty sure that you can't identify the effects
of before/after either, because all you have is binary data -- if you
had continuous data you *might* be able to detect variation in slope
among individuals, if it weren't confounded with residual error).
So I would try
glmer(cannabisUse_bi ~ cannabisShow*moment +
(cannabisShow*moment|school) + (1|id.factor), family=binomial,
data=dat.long)
(assuming that id.factor is unique across schools)
Now, this model doesn't include the effect of the intervention, and
if I include that, I get:
rep_measures.new.model <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|school/id.factor), family=binomial(link = "logit"), data=dat.long);
If I compare these two models using anova(), the second one fits better (logLik from -182.02 to -166.68, ChiSq = 30.681, Df = 2, p = 2.177e-07). However, when you look at rep_measures.new.model, none of the fixed effects is significant. I may be completely wrong, but doesn't this mean that neither the cannabisShow variable nor its interaction with measurement moment (i.e. 'time') contributes to explaining the dependent variable (i.e. cannabisUse_bi)?
Maybe the before/after variation among schools (moment|school) is doing a lot? Also, see my comment above about Wald tests.
(in fact, I'm also a bit confused as to the p-values that lmer provides for the fixed effects. I thought that there were good reasons not to provide them - and that lmer wasn't supposed to? [3] (I don't understand the post - I'm sadly not a statistician - but I thought I got the gist.) Apparently this changed . . . ?)
glmer provides likelihood ratio tests, which are good when the sample size is large. If you didn't have the school level I would say not to worry about it, but 8 schools is not a large number ...
And now that I'm mailing anyway: what is the difference between these two models?
rep_measures.new.model.1 <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|school/id.factor), family=binomial(link = "logit"), data=dat.long);
rep_measures.new.model.2 <- lmer(usedCannabis_bi ~ 1 + moment * cannabisShow + (moment|id.factor:school), family=binomial(link = "logit"), data=dat.long);
R gives slightly (but only slightly) different coefficient estimates; but for the first one, it seems to understand that school is a level (with 8 values), whereas for the second one, this is apparently not specified . . . What's the difference between the slash and the colon for indicating levels (the levels have to be 'the other way around', apparently)?
The second leaves out the school effect, as specified above.
I'm sorry to bother the list with such basic questions. I've been looking for a tutorial or explanation, but I've only been able to find little bits of information that I pieced together into my current (lack of) understanding . . .
Thank you in advance! Gjalt-Jorn Peters
PS: I've put the R script at http://sciencerep.org/files/7/the%20cannabis%20show%20-%20analyses.r (the part I'm talking about now starts after the line with "###### Behaviour", line 195 - the real analyses I'm talking about now start at line 314). This .R file downloads the data from http://sciencerep.org/files/7/the%20cannabis%20show%20-%20data.tsv
The output you should get is at http://sciencerep.org/files/7/the%20cannabis%20show%20-%20output.txt (but the output file is kind of hard to interpret without the analyses file, as I didn't "cat" all comments).
[1] http://tolstoy.newcastle.edu.au/R/e2/help/06/10/3345.html [2] http://www.rensenieuwenhuis.nl/r-sessions-17-generalized-multilevel-lme4/ http://www.talkstats.com/showthread.php/14393-Need-help-with-lmer-model-specification-syntax-for-nested-mixed-model
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models