Today is apparently the day for me to follow up on zombie threads: While Alex is right that mixed-effects models are difficult with small sample sizes, there seems to be an implicit assumption that a large number of grouping levels and thus a large number of random intercepts increases the number of model parameters. This isn't correct. The variance of the intercepts is a model parameter, not the intercepts themselves -- the individual intercepts are BLUPs / predictions / conditional models and not estimates / model parameters. This is the difference to including subject-id as a covariate in a classical fixed-effects regression. Indeed, increasing the number of groups generally helps in mixed-effects regression because variance estimates are quite sensitive to small sample sizes. (@Ben Bolker and others if I've messed this up, please correct me!) Your sample size is still nonetheless quite small if you want to model e.g. the variance for playback type (which I think could be important in your model). Using rmANOVA isn't really better than a mixed-effects model here, because rmANOVA *is* an intercepts-only mixed-effects model with additional symmetry assumptions (sphericity)! Compare the output of
summary(aov(Reaction ~ Days + Error(Subject),sleepstudy))
Error: Subject
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 17 250618 14742
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Days 1 162703 162703 169.4 <2e-16 ***
Residuals 161 154634 960
and
anova(lmer(Reaction ~ Days + (1|Subject),sleepstudy))
Analysis of Variance Table
Df Sum Sq Mean Sq F value
Days 1 162703 162703 169.4
You can get the residual Mean Sq F in the mixed model by looking at the
variance column of the Residual grouping under Random effects in
summary()-output.
The advantage to mixed-effects models in this case is that you're doing
the regression explicitly.
Best,
Phillip
On 06/13/2016 01:43 AM, Sam Hardman [sah74] wrote:
Hi Alex,
The reason I was looking at using mixed models is because I have
repeated measures for indivuals (i.e. each bird was tested three times
with three different playback treatments). As far as I know ANOVA can't
control for this. I am wrong about this?
You are correct that I have 40 animals in the experiment, with three
treatments each that makes 120 measurements per response behaviour. I
agree with you that the sample size is small which could cause problems
with reliability of results. If ANOVA could work then I will certainly
look at that. How would I control for repeated measures?
Thanks for your help, I appreciate it.
Sam
------------------------------------------------------------------------
*Von:* Alex Fine <abfine at gmail.com>
*Gesendet:* Sonntag, 12. Juni 2016 23:10:59
*An:* Phillip Alday
*Cc:* Sam Hardman [sah74]; r-sig-mixed-models at r-project.org
*Betreff:* Re: [R-sig-ME] P-values from interaction terms using lme4
Hey Sam,
I actually think mixed effects regression might be inappropriate in this
case. Do I have it right that you have 40 total animals in the
experiment? You said you had 1 bird in the city center and 1 outside
the city in 20 cities.
If that's right, then the model you described (to take one of your
examples) would be:
Approach ~ PlayBack * UrbanRural + (1|ID)
That's 3 * 2 - 1 = 5 coefficients in your model, plus a random intercept
for 20 different cities. I don't think you have enough data to fit such
a model and trust the results.
If you have a balanced design and you have clear a priori predictions
about each of the contrasts in your design, I'd recommend just using
ANOVA. I don't think you gain very much from using MEMs in this case.
You could use vanilla linear regression, but setting up the coding
schemes for your two predictors is going to be so complicated that it'll
actually be much simpler to just do ANOVA + t-tests.
Alex
On Sun, Jun 12, 2016 at 4:31 AM, Phillip Alday
<Phillip.Alday at unisa.edu.au <mailto:Phillip.Alday at unisa.edu.au>> wrote:
Hi Sam,
if you're getting p-values from lmer outside of a likelihood-ratio test,
then you're using lmerTest, not lme4. lmerTest is designed to be a
drop-in replacement for lme4, but it does bring some extra
features/'complications' in the form of the denominator degrees of
freedom.
There are two easy options for getting ANOVA-style p-values:
1. Using lmerTest, you can just do
anova(mod1,ddf="Satterthwaite",type=2)
If you're using vanilla lme4, then you need to do this first:
library(lmerTest)
mod1 <- as(mod1,"merModLmerTest")
which will cast your model to an lmerTest model.
2. Using the car package:
library(car)
Anova(mod1,test.statistic="F",type=2)
If you fitted your model with maximum likelihood, i.e. with REML=FALSE,
then you need to refit your model using REML first:
mod1 <- update(mod1,REML=TRUE)
For both options, you need to specify the types of test you're doing. I
highly recommend Type 2, but there is a lot of material on that debate,
search for e.g. Venables' "Exegeses on linear models", or look at the
documentation for car::Anova(). Please note that the distinction for
Type 2 vs Type 3 is *very* relevant for your question since you're
concerned about interaction terms.
For the lmerTest route, you can specify the approximation to use for the
denominator degrees of freedom: Satterthwaite is much faster, but
Kenward-Roger is more accurate. For car::Anova(), the F-statistic is
always computed with Kenward-Roger (and only works for REML-fitted
models for reasons that I can't explain quickly), but you have the
option of using a Chisq test statistic, which is equivalent to assuming
that the denominator degrees of freedom are infinite, or equivalently,
that your t-values for the coefficients are z-values.
These ANOVA-style tests are Wald tests and are asymptotically equivalent
to the LR-tests, but are less conservative for finite samples.
Now, since you care about particular contrasts within the model, you may
also just want to look at your model coefficients. Depending on which
coding scheme you're using, the contrasts represented by your model
coefficients might not be the ones you want, but packages like lsmeans
(a regular mention on the list here and very well documented) can
compute all types of contrasts post-hoc. Rereading your question, this
may be the best way to go for your target conclusion/result statements.
Best,
Phillip
On Sat, 2016-06-11 at 16:41 +0000, Sam Hardman [sah74] wrote:
> Dear all,
>
>
> I have some data which I would like to analyse using lme4 and I
would really appreciate some help deciding what the best method is.
>
>
> My experiment is as follows:
>
>
> I tested the responses of urban and rural great tits to playbacks
of great tit song from a loud speaker within their territories.
>
>
> I created three playback song types:
>
> -undegraded
>
> -degraded
>
> -very degraded
>
>
> I played these to birds in 20 different cities. In each city I
tested one bird in the city centre and one bird in a rural location
outside of the city (i.e. paired samples). Each bird received all
three playback types.
>
>
> I measured five different responses to these playbacks:
>
> -Time to sing back to playback (in seconds)
>
> -Time to sing back to playback (in seconds)
>
> -Time the bird spent within five metres of the speaker (in seconds)
>
> -Number of times the bird flew over the speaker (count)
>
> -The closest approach the bird made to the speaker (in metres)
>
>
> For each of these five repsonses I would like to know if there is
an interaction between habitat and playback type.
>
>
> So, I have a model which looks like this:
>
>
> mod1<-lmer(repsonse ~ Playback*UR + (1|ID))
>
>
> Where response is one of the five repsonse behaviours, playback is
the playback type, UR is habitat type (urban or rural) and ID is the
ID of the bird.
>
>
> This gives me results and P-Values but am not sure these P-values
are valid and I think I should compare this model to a null model to
get a valid P-values.
>
>
> So I can use a likelihood ratio tests to test for differences in
response by habitat type alone:
>
>
> mod1<-lmer(approach ~ UR + (1|Location))
> mod2<-lmer(approach ~ 1 + (1|Location))
> anova(mod1, mod2)
>
> or for differences in response according tom playback type alone:
>
> mod1<-lmer(approach ~ Playback + (1|Location))
> mod2<-lmer(approach ~ 1 + (1|Location))
> anova(mod1, mod2)
>
> But how should I do this when there is an interaction term? I
deally I would like P-values for each playback type in interaction
with habitat. e.g.
>
> Undregraded playback * Habitat (urban/rural
> Degraded playback * Habitat (urban/rural)
> Very degraded playback * Habitat (urban/rural)
>
> This would allow me to say, for example, something like "urban
birds approached the speaker more closely than rural birds in
response to undegraded playbacks". I would like to do this with each
of the five response behaviours.
>
> I would really appreciate any suggestions for the best way forward
with this and apologies if this question is too simple for this group.
>
> Best wishes,
> Sam
>
>
> --------------------------------------------------------------------
> Aberystwyth - Prifysgol Gyntaf Cymru https://www.aber.ac.uk/cy/
>
> Aberystwyth - Wales' First University https://www.aber.ac.uk/en/
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Alex Fine
Ph. (336) 302-3251
web: http://internal.psychology.illinois.edu/~abfine/
<http://internal.psychology.illinois.edu/~abfine/AlexFineHome.html>
--------------------------------------------------------------------
Aberystwyth - Prifysgol Gyntaf Cymru https://www.aber.ac.uk/cy/
Aberystwyth ? Wales? First University https://www.aber.ac.uk/en/