Understanding and analysing split split design using lmer()

Rabin KC

Mon, Sep 9, 2019 4:05 PM

Dear Peter,

Your reply has made my day and very clearly answered the questions I was
asking myself for a few weeks. Thank you for correcting my mistakes as
well.

The lmer() does produce a variance of 0 for random effect (crop: rep). I
now know the term is called boundary (singular) fit. What would be the
correct thing to do in this case? I do have this experiment conducted for 2
years in 2 different locations, which might take care of the precious
degrees of freedom, but again, will add new variables making the model more
complicated.

I ran an Anova II (wald test) and was thinking of doing post hoc analysis
on significant terms. (If this would take care of singularity). Any advice
on that would be greatly appreciated.

And yes, I am in Agronomy in U of M. I will be writing to you in your work
email about the friendly statistician you mentioned!!!!


Best regards,
Rabin

On Mon, Sep 9, 2019 at 4:43 PM Peter Claussen <dakotajudo at mac.com> wrote:


On Sep 9, 2019, at 1:35 PM, Rabin KC <kc000001 at umn.edu> wrote:

Hello R mixed models community,

?



model1<aov(cover_coverage~rep+crop*tillage*cover+Error(rep/crop/tillage),data=data)

For my  analysis using lmer, and lme,  I have the following code:

model2<-lmer(biomass~crop*tillage*cover+ (1|rep),data=data)

model3<-lme(fixed=biomass~crop*tillage*cover, data=data, random=~1|rep)

Now my confusion and questions are:
1. I have used crop, tillage, and cover as fixed effects. And only the rep
(block) as random. Is the random effect properly assigned for this design?


Simply, no.

When you have three levels of independent randomization, then your design
model has three random effects

Consider the expected output from agricolae. There should be an error term
for crop, as randomized within whole blocks, designated Ea. This is the
crop x replicate interaction term, and would also be the residual error if
you were to analyze these data as a simple RCB of 2 treatments and 4
replicates (averaging biomass over whole plots). The F-test you would use
to test the significance of crop would be MS(crop) / Ea, while the F-test
for replicate effects would be MS(rep)/Ea.

There will be a second error strata, representing the independent
randomization of tillage within whole plots (replicate:crop). That should
be labelled Eb. This would also be the residual error of a simple
split-plot experiment with crop as whole plot and tillage as subplot
treatments (averaging biomass over subplots) and would be represented as
rep:crop:tillage. This is the error term to test tillage and crop-tillage
interaction.

The final error strata is the independent randomization of cover within
crop:tillage subplots, and this will work out to be equivalent to residual
error. So, to duplicate the decomposition of the aov from agricolae, we
might write the linear model as

lm(biomass ~ crop*tillage*cover + rep +  rep:crop + rep:crop:tillage,
data=data)
or, more briefly, as
lm(biomass ~ crop*tillage*cover + rep/crop/tillage,data=data)

That won?t give the correct F-tests, though, so to get aov to perform
appropriate tests, we should write

aov(biomass ~ crop*tillage*cover + rep + Error(rep:crop/tillage),
data=data)

(not what you?ve written, since we test rep against rep:crop)

This leads to an lmer model of
lmer(biomass ~ crop*tillage*cover+(1|rep/crop/tillage), data=data)


2. I have read a lot about crossed and nested design. Crawley and Oehlert
give particular examples about it, but I have trouble understanding if this
design has nested factors or crossed factors.


crop, tillage and cover are all crossed factors, since each possible
combination of crop, tillage and cover are included, and you might be
interested in the interactions among these factors. Since this is a
split-split-plot experiment, the experimental units (rep:crop = whole plot,
rep:crop:tillage=subplot, rep:crop:tillage:cover = sub subplot) are nested,
and not all combinations of treatments are subject to the same experimental
errors.

That?s where the analysis starts to become tricky. If you want to compare
treatments that are all applied within same random effects, that is,

Crop A : Tillage A : Cover A vs Crop A : Tillage A : Cover B,

then the error terms for mean comparisons are simple. Comparisons across
strata, like

Crop A : Tillage A : Cover A vs Crop B : Tillage A : Cover A

require estimates of the variances for each level of error, and with only
a few observations (Ea (crop:rep) should have only 3 d.f.) you might not
get estimates from lmer, and error terms derived from aov might include
negative variances. AOV of the linear model (lm) is useful here, in that if
any of the F-ratios of the random effects (rep, rep:crop or
rep:crop:tillage) are less than 1, then you should expect lmer to produce a
variance estimate of effectively 0. I would need to write out the expected
means squares for a split-split-plot, though, to justify this statement.

By your email address and the subject matter, I?m guessing you?re in
agronomy at the U of M. If so, I can suggest a friendly neighborhood
statistician to consult.

Cheers,

Peter Claussen
(work email Peter at gdmdata.com)

Understanding and analysing split split design using lmer()

Thread (10 messages)