lmer model specification problem

Ram H. Sharma <sharma.ram.h at ...> writes:
It is going to be hard to estimate the random-effects variance with
just three levels (three years).  You will most likely have to estimate
year as a fixed effect and accept that you will not be able to
generalize across years reliably.

I don't know what you mean by 1:2 here.  You say "just three are
shown as example", yet you only show two.  It looks from this as though
you only have two villages, in which case the comment above applies
(but even more strongly because you have only 2 rather than 3 levels).

This makes sense.  (Perhaps the comment about "just three are shown
as example" was accidentally copied to 'village' above?)

I think you are confused (as is quite common) about nesting and
crossing.  Let's assume for the moment that you don't have enough
data to estimate the interaction between "variety" and village/farm
(i.e. varieties are not grown in enough different villages and farms
to estimate whether they have variable yields across villages and
farms).  In that case,

  gryld ~ year+village+year:village+(1|farm:village:year)+variety

seems reasonable.  

  I have left out variety:year + variety:village + variety:year:village,
which you included in your model (see why, below).
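
To make the nesting concrete, here is a rough sketch in base R.  The
data frame `d`, the village labels "A"/"B" and the balanced layout are
all made up for illustration (the sizes are guesses, not your real
data).  It assumes farm labels 1-9 are reused in each village, which is
the situation where `farm` alone does not identify a farm and an
explicit combination such as `farm:village` is needed.

```r
# Hypothetical balanced layout (assumed sizes, not the real data):
# 3 years, 2 villages, 9 farms per village, 6 varieties per farm per year.
d <- expand.grid(variety = factor(1:6),
                 farm    = factor(1:9),
                 village = factor(c("A", "B")),
                 year    = factor(1:3))

# Farm labels are reused across villages, so 'farm' alone has only 9 levels.
n_farm_labels <- nlevels(d$farm)

# Combining farm with village identifies the distinct farms.
n_farms <- nlevels(interaction(d$farm, d$village, drop = TRUE))

# Combining with year gives the grouping units of (1|farm:village:year).
n_farm_years <- nlevels(interaction(d$farm, d$village, d$year, drop = TRUE))

c(labels = n_farm_labels, farms = n_farms, farm_years = n_farm_years)
```

Under these assumptions there are 9 farm labels, 18 distinct farms and
54 farm-by-year groups, with about 6 observations in each group.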

A lot will depend on how much data you have.
If you have about 6 varieties per farm, 9 farms per village, 2 villages,
3 years, for a total of about 324 data points (from above it looks
like you may have either 5 or 6 varieties per farm per year), then
you will be limited to estimating approximately 15 to 30 parameters
(1 parameter per 10-20 data points). This means you will have to
think carefully about how to restrict your model.  In principle you
could say

  gryld ~ (year+village+variety)^3 + (year+village+variety|farm:village)

to estimate *all* of the interactions, but this will be far more than
your data can support.  The first model I suggested above has

  1 (intercept) + 2 (year) + 1 (village) + 2 (village:year) +
  9 (variety) + 1 (variance of the farm:village:year random effect)
  = 16 parameters.

If something like the interaction of variety by year or variety by
village is very important to you, you could attempt to put it in, but
you probably have to choose one or the other (variety:year = 18
additional parameters, variety:village = 9 additional parameters).
variety:year:village would add an additional 18 parameters on top of
this.  Trying to fit a model with all of these terms
(16 + 18 + 9 + 18 = 61 parameters) to a data set with 324 data points
is not going to work very well.
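
The counts above can be checked with a small base-R sketch.  The design
below is hypothetical: 10 varieties is inferred from the "9 (variety)"
contrasts, and `d`, `n_fixed`, `p_small` and `p_full` are names invented
here.  It counts fixed-effect coefficients as columns of the design
matrix and adds 1 for the single random-effect variance; the residual
variance is not counted, to match the tally above.

```r
# Hypothetical balanced design (assumed, not the real data):
# 3 years, 2 villages, 10 varieties.
d <- expand.grid(year    = factor(1:3),
                 village = factor(c("A", "B")),
                 variety = factor(1:10))

# Number of fixed-effect coefficients = number of design-matrix columns.
n_fixed <- function(f) ncol(model.matrix(f, data = d))

# Smaller model: main effects plus year:village, plus 1 random-effect variance.
p_small <- n_fixed(~ year + village + year:village + variety) + 1

# All two- and three-way interactions, plus 1 random-effect variance.
p_full <- n_fixed(~ (year + village + variety)^3) + 1

# Rough data budget: 6 varieties x 9 farms x 2 villages x 3 years,
# at 10 to 20 observations per parameter.
n_obs  <- 6 * 9 * 2 * 3
budget <- n_obs / c(20, 10)

c(small = p_small, full = p_full, n_obs = n_obs)
budget
```

With these assumptions the smaller model has 16 parameters and the full
one has 61, against a budget of roughly 16 to 32 parameters for 324
observations.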

  Do not be tempted to throw everything in and use stepwise
approaches to discard terms that appear non-significant.