Syntax for lme function to model random factors and interactions

See inline
On Mon, May 21, 2012 at 11:17 AM, i_like_macs <dkoya at mac.com> wrote:
so in this case x1 ... xn are random effects where the effects are
allowed to vary across levels of g1 .. gm.
that is correct, the left side is the random effect (random intercepts
and/or random slopes), the right side is whatever variable codes the
levels that the random effect can vary across.
so this would indicate that C and D are random effects, but you will
need something to indicate what they get to vary across.
I do not think that is what you want (and besides I do not think that
lme() allows multiple random arguments though I could be wrong because
I work with lmer more than lme).
yes, ~ 1 | C  means that there is a random intercept for each level of
C.  If you do this, you will get an estimate of the average intercept
in your model, but you will also get an estimate of
the variance in intercepts (technically the intercepts are assumed to
come from a normal distribution, and you will get maximum likelihood
estimates of the mean and variance (or standard deviation) of that
distribution).
Understanding more theory would probably help.  The Pinheiro and Bates
text, as wonderful as it is, may not be the easiest place to start.  I
have not seen you mention anything about a grouping or nesting
structure in your data.  This may be part of the confusion too.  Many
uses of mixed models are for that case.  A classical example would be
students nested within classrooms.  In that case, the research
question could be: does the number of hours spent on homework predict
grades?  The model could look like:

grades ~ 1 + homework

or in ordinary regression notation instead of R's formula:

grades = Xb + e

where X is a design matrix the first few rows of which might look like:

1  2
1  2.5
1  3
1  2

the first column being the intercept and the second the number of
hours spent on homework for each student.  b will be a vector of
coefficients, the first coefficient being the estimated intercept and
the second the slope of grades on homework.  e is a vector of
residuals, that part of grades which cannot be explained by the
intercept and homework.  The assumption in ordinary regression is that
e is identically and independently distributed, but students are
within classrooms, and we might guess that in fact, each student was
not really an independent observation---there is some similarity
because they share a classroom.  Mixed models address this by adding
random effects.  Following the above example, we might do:

grades ~ 1 + homework
random = ~ 1 | ClassroomID

this allows the intercepts to randomly vary by classroom, which is
sensible---some classes may have more or less skilled students so
given that everyone did 0 hours of homework, we still might expect
some classrooms to have higher or lower grades.  This models that.
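
To make the ordinary regression and the random intercept model above concrete, here is a minimal sketch in R.  The data are simulated, so every number in it (20 classrooms, 15 students each, the true intercept of 60, the slope of 4, and so on) is an assumption made up for illustration, not anything from your data.  It uses lme() from the nlme package.

```r
library(nlme)  # recommended package, ships with standard R installations

set.seed(42)  # arbitrary seed so the simulated data are reproducible

# Simulated (made-up) data: 20 classrooms with 15 students each
n_class <- 20
n_stud  <- 15
ClassroomID <- factor(rep(seq_len(n_class), each = n_stud))
homework    <- round(runif(n_class * n_stud, min = 0, max = 5), 1)

# Each classroom gets its own intercept shift, drawn from a normal distribution
class_shift <- rnorm(n_class, mean = 0, sd = 5)
grades <- 60 + 4 * homework + class_shift[as.integer(ClassroomID)] +
  rnorm(n_class * n_stud, mean = 0, sd = 3)

dat <- data.frame(grades, homework, ClassroomID)

# Design matrix X for grades ~ 1 + homework: a column of 1s and the hours
X <- model.matrix(~ 1 + homework, data = dat)
head(X, 4)

# Ordinary regression, which treats every student as independent
fit_lm <- lm(grades ~ 1 + homework, data = dat)

# Same fixed effects, plus a random intercept for each classroom
fit_ri <- lme(grades ~ 1 + homework, random = ~ 1 | ClassroomID, data = dat)

fixef(fit_ri)    # average intercept and the slope of grades on homework
VarCorr(fit_ri)  # estimated variance and SD of the classroom intercepts
```

fixef() gives the average intercept and slope, while VarCorr() reports how much the classroom intercepts vary around that average, alongside the residual variance.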
Now let's say that, further, you think that the effects of homework
might vary across classrooms.  Perhaps students in very low
performing classes get an enormous benefit from spending time on
homework, whereas in the very high performing classes, grades
only marginally improve for every additional hour of homework.  You
could then write this as:

grades ~ 1 + homework
random = ~ 1 + homework | ClassroomID
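
As a rough sketch of the random slope model in R, again with simulated data: the classroom counts, the true effects, and the choice to simulate intercepts and slopes as independent of each other are all assumptions for illustration only.

```r
library(nlme)  # recommended package, ships with standard R installations

set.seed(7)  # arbitrary seed so the simulated data are reproducible

# Simulated (made-up) data: 30 classrooms with 20 students each
n_class <- 30
n_stud  <- 20
ClassroomID <- factor(rep(seq_len(n_class), each = n_stud))
homework    <- runif(n_class * n_stud, min = 0, max = 5)

# Each classroom gets its own intercept shift and its own slope shift
u0  <- rnorm(n_class, mean = 0, sd = 5)    # intercept deviations
u1  <- rnorm(n_class, mean = 0, sd = 1.5)  # homework slope deviations
idx <- as.integer(ClassroomID)

grades <- (60 + u0[idx]) + (4 + u1[idx]) * homework +
  rnorm(n_class * n_stud, mean = 0, sd = 3)

dat <- data.frame(grades, homework, ClassroomID)

# Random intercept and random homework slope, both varying by classroom
fit_rs <- lme(grades ~ 1 + homework,
              random = ~ 1 + homework | ClassroomID,
              data = dat)

fixef(fit_rs)       # average intercept and average homework slope
VarCorr(fit_rs)     # variances of intercepts and slopes, and their correlation
head(coef(fit_rs))  # per-classroom intercepts and slopes (fixed + random)
```

coef() adds each classroom's estimated deviation to the fixed effects, so each row is that classroom's own intercept and homework slope.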

Your data may not be like that, but that (or something along those
lines) is very common and probably what you will see many, many
examples of.  It is not clear to me what A, B, C, and D represent for
you, so it is hard to be very specific about what you should or should
not be doing (and that is where knowing your data and the theory or
consulting with a local statistician can be very helpful).
I would advocate learning the theory and code hand in hand.  I do not
know of any good introductory texts that walk through this
while teaching both mixed models and R (that does not mean they do not
exist, just that I do not know of them).  As I said earlier, I do like the
Pinheiro and Bates book, but I would not give it to a social sciences
graduate student or someone with minimal mathematical and statistical
background.  Are there any resources at your university? (are you at a
university?)

Cheers,

Josh