
lmertest F-test anova(fullm) and anova(fullm,reducedm)

4 messages · Lionel, marie devaine

#
Dear mixed-model list,

I am sorry if my questions sound trivial: I am completely new to R and mixed models.

My data set is the following: I am trying to model the scores of primates from
different species in different conditions of a task. Each individual
repeats each condition a certain number of times (usually 4, with some
exceptions).
I have only a few individuals per species (from 4 to 7), 3 conditions and 7
species.

As predictors (independent variables), I am mostly interested in Condition
and Specie, but I want to correct for a learning effect at the individual
level (a parametric effect of repetition, 'Order').

I wrote the following model (with a random intercept for Subject and a
random slope for 'Order'):
fullm = lmer(Scores ~ Condition*Specie+(1+Order|Subject))
1) Is it a sensible way to model my data?

Then, I want to test for the interaction between Specie and Condition. I
found two ways to do so with the lmerTest package:
* computing the p-value of the F-test corresponding to Specie:Condition as
given by anova(fullm).
* constructing the reduced model without the interaction
reducedm= lmer(Scores ~ Condition+Specie+(1+Order|Subject))
and performing the likelihood ratio test: anova(reducedm,fullm).

2) What is the conceptual difference between the two methods?

3) The numerical results are different in my case (p-values around .05:
below with the model comparison, above with the F-test). Why is this
the case? Is one better than the other?

4) This point is not directly related to my title, but concerns the same data
and the lmerTest package: the species are categorical for now, but I
could instead use a numerical value such as the encephalization quotient
of each species. In this case, how could I evaluate the significance of the
parametric effect? lsmeans seems to handle only categorical factors.

I am very likely missing some very simple points here, and would be very
thankful if you could help me with them.

Thank you in advance,

Marie Devaine
#
Dear Marie Devaine,

1) I would not account for the order effect this way. I can see various
options:
  - The effect of Order on Scores does not change the relationship
between your fixed effects and the Scores, and each individual is
"learning" the task differently. I would then use a nested random part:
Scores~Condition*Specie+(1|Subject/Order). You would then get an
estimate of how much variation there is in the Scores between subjects,
and also how much variation there is within subjects between Order levels.
  - Order changes the relationship between your fixed effects and
the Scores, i.e. the Condition effect on the Scores differs depending on
whether a primate is in its first trial or its fourth one. You would then
need random slopes, and one way to go would be:
Scores~Condition*Specie+(1+Condition|Subject/Order). You would then get
the same estimates as in the previous option, plus how much the Condition
slope varies between Subjects and, within Subjects, between Order levels.
Given your number of levels, I guess that the estimation will be
rather tricky ...
You can see the wiki for more information on this:
http://glmm.wikidot.com/faq#modelspec
I guess that you are misinterpreting the random slope part. You can see
it as an interaction term between one fixed-effect term and one random
term. For example, if you measured the weights of your primates
and hypothesized that weight affects the scores, but that this
effect (direction and strength, i.e. the slope) might vary between
subjects, then you would have a random slope of weight depending on the
subject: (weight|subject).

2-3) The first method tests whether the interaction term explains a large
enough portion of the total sum of squares; it is a measure of how
important this term is in explaining the variation in your data. The
second method compares the likelihood (i.e. the probability of observing
this dataset given a particular set of parameters) between the model with
and the model without the interaction term. If removing the
interaction term leads to a big decline in the likelihood of the model,
then the p-value should come out significant and you should keep the full
model; otherwise the parsimony principle would lead you to choose
the reduced model. So the difference comes from the fact that the two
methods compute different things. As to which one is better, this
is a tricky question. The way I would go would be to compute confidence
intervals around the main effects and the interaction term, using bootMer
for example, and then interpret them. You may have a look at ?pvalues for
more options/suggestions.
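
To make the two approaches concrete, here is a minimal sketch on simulated
data. The data frame d, its sizes and all numeric values are made up for
illustration and are not the data from this thread; it assumes the lmerTest
package is installed. With lmerTest loaded, anova(fullm) reports F-tests
(type III, Satterthwaite degrees of freedom by default), while
anova(reducedm, fullm) refits both models with ML and reports a chi-square
likelihood ratio test, which is one reason the two p-values need not agree.

```r
library(lmerTest)  # masks lme4::lmer so that anova() reports F-tests with p-values

set.seed(1)
# Hypothetical design: 7 species x 5 subjects, 3 conditions x 4 repetitions each
d <- expand.grid(Rep = 1:4, Condition = factor(c("A", "B", "C")),
                 Subject = factor(1:35))
s <- as.integer(d$Subject)
d$Specie <- factor(paste0("sp", (s - 1) %/% 5 + 1))
# Order: position of each trial in a random per-subject sequence (1..12)
d$Order <- ave(seq_len(nrow(d)), d$Subject, FUN = function(i) sample(length(i)))
# Scores with subject-specific intercepts and Order slopes, no true interaction
u0 <- rnorm(35)
u1 <- rnorm(35, sd = 0.2)
d$Scores <- 10 + u0[s] + u1[s] * d$Order + rnorm(nrow(d))

fullm    <- lmer(Scores ~ Condition * Specie + (1 + Order | Subject), data = d)
reducedm <- lmer(Scores ~ Condition + Specie + (1 + Order | Subject), data = d)

# Method 1: F-test for the interaction term in the full model
f_tab <- anova(fullm)
print(f_tab["Condition:Specie", ])

# Method 2: likelihood ratio test between nested models (refitted with ML)
lrt_tab <- anova(reducedm, fullm)
print(lrt_tab)
```

On simulated data like this, lmer may print convergence or singular-fit
messages; they do not stop the comparison from running.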

As I am not familiar with the lmerTest package, I will not comment on your
last question.

Hoping that I clarified some points,
Lionel
On 27/11/2014 16:03, marie devaine wrote:
#
Dear Lionel,

Thanks a lot for your input.

1) I am still not sure how to write things down, and I am sorry that
my description of the data and model was not clear enough.
I am in the first of the two cases that you describe, i.e. the Order
effect is a parametric effect, Subject-specific but independent of the
Condition levels. In fact, the Order variable is just a count of the number
of times the task has been performed, irrespective of which Condition has
been performed. It is not a categorical variable and is just supposed to
capture how well the primate is learning general features of the task
(independent of Condition).
As it is, Scores~Condition*Specie+(1|Subject/Order) gives me an error, since
Order values are interpreted as levels, but there are as many levels as
observations per subject.
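
The situation described above can be reproduced with a small sketch on
made-up data (the data frame d and its sizes are hypothetical, and the fixed
part is reduced to an intercept for brevity). (1|Subject/Order) expands to
(1|Subject) + (1|Subject:Order), so Order is used as a grouping factor, and
when each subject sees each Order value once, Subject:Order has one level
per observation.

```r
library(lme4)

set.seed(2)
# Hypothetical data: 35 subjects, each with trials numbered 1..12
d <- expand.grid(Order = 1:12, Subject = factor(1:35))
d$Scores <- rnorm(nrow(d))

# Number of Subject:Order cells versus number of observations
n_cells <- nlevels(interaction(d$Subject, d$Order, drop = TRUE))
print(c(levels = n_cells, observations = nrow(d)))

# Capture the error message instead of stopping the script
res <- tryCatch(
  lmer(Scores ~ 1 + (1 | Subject/Order), data = d),
  error = function(e) conditionMessage(e)
)
print(res)
```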

In fact, in your example, I don't really see the difference between
(weight|subject) and (1|subject), since in both cases the model estimates
one coefficient per subject.


2)3) This is very clear, thank you again.

Marie


2014-11-27 18:52 GMT+01:00 Lionel <hughes.dupond at gmx.de>:

  
    
#
Dear Marie,

Ok, that makes things easier. I would then go for:
Scores~Condition*Specie+Order+(1+Order|Subject). You would then get an
estimate of how variable the intercept is between Subjects, plus
how variable the slope of Scores vs Order is between Subjects. In this
context, having one value per subject and order will not be a problem. I
guess the discussion between this model and the one you wrote is similar
to the one about having an interaction term without having the main effect
in the first place. I am not sure if it is also an issue in mixed models,
but just to be safe I would include Order as a main effect.
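
As a rough sketch of what this model returns, here it is fitted to simulated
data (the data frame d, the design sizes and the true values such as the 0.3
average learning slope are all invented for illustration). The fixed effect
of Order is the average learning slope, and VarCorr() shows the
between-subject variability of intercepts and Order slopes.

```r
library(lme4)

set.seed(3)
# Hypothetical design: 7 species x 5 subjects, 3 conditions x 4 repetitions each
d <- expand.grid(Rep = 1:4, Condition = factor(c("A", "B", "C")),
                 Subject = factor(1:35))
s <- as.integer(d$Subject)
d$Specie <- factor(paste0("sp", (s - 1) %/% 5 + 1))
# Order: position of each trial in a random per-subject sequence (1..12)
d$Order <- ave(seq_len(nrow(d)), d$Subject, FUN = function(i) sample(length(i)))
# Average learning slope 0.3, plus subject-specific intercepts and slopes
u0 <- rnorm(35)
u1 <- rnorm(35, sd = 0.2)
d$Scores <- 10 + u0[s] + (0.3 + u1[s]) * d$Order + rnorm(nrow(d))

m <- lmer(Scores ~ Condition * Specie + Order + (1 + Order | Subject), data = d)

print(fixef(m)["Order"])  # estimated average learning slope
print(VarCorr(m))         # SDs of subject intercepts and Order slopes, and their correlation
```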

In my example, the model with (1|subject) will estimate the variation of
the intercept. You could actually get the estimated deviation of each
subject from the average, but this is usually not of much interest. So
if you do ranef(model) you would get one column, one 'coefficient' per
subject (actually these are the deviations from the overall coefficient;
they are not coefficients per se, as the model did not estimate them
individually).
However, if your model is (weight|subject), which is equivalent to
(1+weight|subject), then you would again get the variation of the
intercept PLUS the variation of the slope of the response vs weight. If you
do ranef(model) you would then get two columns, so two 'coefficients' per
subject.
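
The one-column versus two-column difference can be checked with the
sleepstudy data that ships with lme4, used here as a stand-in for the
primate data: Days plays the role of weight (or Order), and the object names
m_int and m_slope are just for illustration.

```r
library(lme4)

# Random intercept only: one deviation per subject
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
# Random intercept and slope; (Days | Subject) is the same as (1 + Days | Subject)
m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

print(colnames(ranef(m_int)$Subject))    # intercept deviations only
print(colnames(ranef(m_slope)$Subject))  # intercept and Days-slope deviations

# Fixed effect plus deviation gives each subject's own intercept and slope
print(head(coef(m_slope)$Subject))
```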

Best regards,
Lionel
On 28/11/2014 10:51, marie devaine wrote: