lmer model for repeated measure in RCB design

[cc'ing back to r-sig-mixed-models]
Thanks Ben!

Yes, you are right, n=56. I don't know what happened there ;)

As for the ID, yes it is unique for each observation and identifies
the sampled genotype in its respective block. The ID is built as
"Genotype_Block".
Technically, I would say that ID is not unique for each
observation since there are three observations (fall, winter, and
spring) for each ID ... ?  (You confirm this below: "each ID is
replicated three times ...") (By "observation", I mean the smallest
sampling unit -- one row of the data frame, in long format)
Each genotype was replicated 5 times within each
block. That way I was able to sample 8 genotypes by only having 5
blocks. That means I sampled three blocks twice for the respective
genotype.
Makes sense.
Then I measured a physiological response on these genotypes in fall,
winter and spring, representing different climate conditions. I
always measured the same IDs over three different conditions (56*3).
So each ID is replicated three times in my ID column.

Also, I grouped these 7 genotypes into 3 groups since I would rather
compare the groups within each climatic condition and across the
climatic conditions instead of all the genotypes.
That makes perfect sense.
Since the ID is replicated 3 times, id is nested within genotype,
correct?

response ~ group*climate + (1|block) + (1|genotype/id)
This looks reasonable, although since id is *implicitly* nested (i.e.
it contains the genotype info) you should also be able to write it as
(1|genotype) + (1|id) .

   When you run this, lmer should report appropriate numbers of levels
in each group (block=5, genotype=8, genotype:id = 56? or 40? I'm not
sure ...) ... check these values and see that they are as you expect.
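
The level counts can also be checked in base R before fitting. The sketch below is only an illustration under assumed data: the layout (7 genotypes, 8 sampled plants each, 5 blocks with three of them used twice, 3 seasons) follows the description in this thread, but the data frame `d`, the `plant` column, and the id labels are made up.

```r
# Hypothetical layout (assumed, not the real data): 7 genotypes, 8 sampled
# plants per genotype spread over 5 blocks (blocks 1-3 used twice), and each
# plant measured in fall, winter and spring.
d <- expand.grid(climate  = c("fall", "winter", "spring"),
                 plant    = 1:8,
                 genotype = paste0("G", 1:7))
d$block <- factor(c(1:5, 1:3)[d$plant])
# Made-up id scheme: one label per sampled plant, unique across genotypes.
d$id <- factor(paste(d$genotype, d$plant, sep = "_"))

# Number of levels lmer would see for each grouping factor.
n_levels <- c(block    = nlevels(d$block),
              genotype = nlevels(d$genotype),
              id       = nlevels(d$id))
print(n_levels)

# Every id should appear exactly three times (once per season).
stopifnot(all(table(d$id) == 3))

# Implicit nesting: genotype:id has no more levels than id alone, so
# (1|genotype/id) and (1|genotype) + (1|id) define the same groups.
stopifnot(nlevels(interaction(d$genotype, d$id, drop = TRUE)) == nlevels(d$id))
```

If the last check fails on real data, the id labels are reused across genotypes and the explicit nesting syntax is the safer choice.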

Thanks again! Stefan

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org on behalf of Ben Bolker
Sent: Wed 1/11/2012 12:13 PM
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] lmer model for repeated measure in RCB design

Schreiber, Stefan <Stefan.Schreiber at ...> writes:

Hi all,

I have a question about the following situation and was hoping to
find clarification here.

I have a data frame with the following variables:

id, genotype, group, block, climate, response

I measured a response of 7 genotypes in a randomized complete
block design. I measured each genotype 8 times (n=48).
You have some missing combinations?  (8*7=56, right?)

I grouped my 7 genotypes into 3 groups that seem more reasonable to
me. I measured the response on the same 7 genotypes 3 times under
different climatic conditions.

I specified block and genotype as random and group as fixed.  I
believe the proper random statement should look like: block,
genotype nested within group.

I came up with the following code:

fit1 <- lmer(weight ~ group*climate + (1|block) +
(1|group/genotype) , data=df)

The problem I have now is how can I include the fact that I
measured the same genotypes at three different times? Can I say
(1|group/genotype/id) instead of (1|group/genotype)?
Is id a unique identifier for each observation?  In that case it's
definitely redundant with the residual variance and should not be
included in the model statement.

I'm still a little bit uncertain about your experimental design 
(thanks for the careful explanation, though).  I'm going to make up 
one possible explanation.  How unbalanced is it?  Does climate 
represent another level of replication (e.g. are there three climate 
conditions that are measured for each group*genotype*block 
combination), or does it vary in an unbalanced way across 
group*genotype*block combinations?  Would your total number of
observations be 8 (blocks) * 7 (genotypes) * 3 (climate conditions)?

You shouldn't include group both as a fixed effect (your fixed
group*climate term expands to group+climate+group:climate) and a
random effect (your group/genotype term expands to 
group+group:genotype).  You should probably use (1|group:genotype)
instead (make sure group and genotype are both stored as factors).

Even if it weren't redundant, including a random effect of group
(with only three groups) is likely to give you an estimated
group-level variance of zero -- there aren't enough levels to
estimate variance reliably.

If genotypes have unique IDs then you don't need the explicit nesting
or interaction syntax.  If so, my best guess is that

weight ~ group*climate + (1|block) + (1|genotype)

is what you want.
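
Whether the explicit nesting syntax is needed depends only on how the genotypes are labelled, which can be checked in base R. The sketch below is an illustration under assumptions: the assignment of 7 genotypes to 3 groups and all labels are made up, not taken from the real data.

```r
# Hypothetical assignment of 7 genotypes to 3 groups (labels are made up).
group <- factor(c("A", "A", "B", "B", "B", "C", "C"))

# Case 1: genotype labels are unique across groups.
unique_ids <- data.frame(group = group, genotype = factor(paste0("G", 1:7)))
n_unique <- nlevels(interaction(unique_ids$group, unique_ids$genotype,
                                drop = TRUE))
# group:genotype has the same 7 levels as genotype, so (1|genotype) suffices.
stopifnot(n_unique == nlevels(unique_ids$genotype))

# Case 2: genotype labels restart at 1 within each group.
reused <- data.frame(group = group,
                     genotype = factor(c(1, 2, 1, 2, 3, 1, 2)))
n_reused <- nlevels(interaction(reused$group, reused$genotype, drop = TRUE))
# Only 3 distinct labels but 7 real genotypes: here (1|group:genotype)
# is needed to keep genotypes from different groups apart.
print(c(labels = nlevels(reused$genotype), genotypes = n_reused))
```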

You might consider whether it's worth including other random terms
-- the most complex model would include (group*climate|block) and
(climate|genotype) -- but you might find that you were running out of
signal ...


Hi all,

I have the following mixed model for my data (Thanks Ben!):

lmer.fit <- lmer(response ~ group*climate + (1|block) + (1|genotype) + (1|id),
                 data=plc)

Here's the summary:

Linear mixed model fit by REML 
Formula: response ~ group*climate + (1|block) + (1|genotype) + (1|id) 
   Data: plc 
  AIC  BIC logLik deviance REMLdev
 1275 1325   -622     1296    1243
Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept)  10.25    3.20   
 genotype (Intercept)   2.75    1.66   
 block    (Intercept)   5.34    2.31   
 Residual             148.21   12.17   
Number of obs: 168, groups: id, 56; genotype, 7; block, 5

Then I ran TukeyHSD(aov(lmer.fit)) and it gives me no error and an
output that "looks" ok. However, I am uncertain whether this is correct
to do, or not.

Here is a made-up example:
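library(lme4)
# 16 subjects, each measured once under each of the 3 levels of treat1
d.fr <- data.frame(id = rep(1:16, 3),
                   treat1 = rep(as.factor(LETTERS[1:3]), each = 16),
                   treat2 = rep(as.factor(letters[4:7]), each = 4),
                   response = rnorm(48))
fit1 <- lmer(response ~ treat1*treat2 + (1|id), data = d.fr)
TukeyHSD(aov(fit1))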

I hope to get some advice on whether this is a valid thing to do.

Thanks!
Stefan