
submitted for list review: ezBuildME()

3 messages · Michael Lawrence, Hadley Wickham, Reinhold Kliegl

#
Hi folks,

Hot on the heels of my release of ez v2.0 last week, I have version
2.1 nearly ready to go. Alongside minor bug fixes, I'm toying with
adding a function to automate the process of building up a mixed
effects model for very simple designs: one random effect (participants
in an experiment) and any number of categorical (factor) fixed
effects. I imagine this being used in factorial experiments where
people have a priori interest in all main effects and interactions
between the fixed effects, so the function automates the process of
building and comparing all pertinent models.

As a neophyte to mixed effects modelling, I thought I'd check with the
list that the function's operation makes sense
statistically/philosophically. The code & documentation are
downloadable here (you'll need to load the plyr and lme4 packages to
use it):

http://rfecs.me/wp-content/uploads/2010/09/ezBuildME.zip

And here's a brief description of the operation:

This function is used to compute sequential comparisons of nested
mixed effects models, testing each possible effect against a model
that contains all effects at levels of interaction lower than that
effect. For example:

- a test of the main effect of a predictor compares a model containing
the main effect of the predictor plus the random effect (specified by
wid) against a model containing only the random effect, e.g.:
    dv ~ v1 + (1|wid)
        versus
    dv ~ (1|wid)

- a test of a 2-way interaction compares a model containing the 2-way
interaction plus the main effects of all predictors plus the random
effect against a model with just the main effects and the random
effect, e.g.:
    dv ~ v1:v2 + v1 + v2 + v3 + (1|wid)
        versus
    dv ~ v1 + v2 + v3 + (1|wid)

- a test of a 3-way interaction compares a model containing the 3-way
interaction plus all 2-way interactions plus all main effects plus the
random effect against a model with all 2-way interactions, all main
effects, and the random effect, e.g.:
    dv ~ v1:v2:v3 + v1:v2 + v1:v3 + v2:v3 + v1 + v2 + v3 + (1|wid)
        versus
    dv ~ v1:v2 + v1:v3 + v2:v3 + v1 + v2 + v3 + (1|wid)

- etc.
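For concreteness, here's a rough base-R sketch of how the nested pairs
at each order could be enumerated (the name build_comparison is made
up for illustration and is not part of ezBuildME; actual fitting would
go through lmer() from lme4):

```r
# Sketch: enumerate the pair of nested model formulas compared when
# testing an effect of a given interaction order, for fixed effects
# in `preds` plus a random intercept (1|wid).
# (Illustrative only; the function name is hypothetical.)
build_comparison <- function(preds, order) {
  # all main effects and interactions up to interaction order k
  terms_up_to <- function(k) {
    unlist(lapply(seq_len(k), function(i)
      combn(preds, i, paste, collapse = ":")))
  }
  full    <- c(terms_up_to(order),     "(1|wid)")
  reduced <- c(terms_up_to(order - 1), "(1|wid)")
  list(full    = reformulate(full,    response = "dv"),
       reduced = reformulate(reduced, response = "dv"))
}

cmp <- build_comparison(c("v1", "v2", "v3"), 3)
cmp$full     # dv ~ v1 + v2 + v3 + v1:v2 + v1:v3 + v2:v3 + v1:v2:v3 + (1 | wid)
cmp$reduced  # dv ~ v1 + v2 + v3 + v1:v2 + v1:v3 + v2:v3 + (1 | wid)
```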

Thoughts?


--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
#
It's a small point, but this comparison is easier to understand if expressed as:

dv ~ (v1 + v2 + v3) ^ 3 + (1 | wid)
vs
dv ~ (v1 + v2 + v3) ^ 2 + (1 | wid)
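For what it's worth, base R confirms that the ^ shorthand expands to
the same set of fixed-effect terms as the spelled-out version
(checking only the fixed part here, since terms() knows nothing about
lme4's random-effect syntax):

```r
# Fixed-effect expansion of the ^ shorthand vs. the spelled-out formula
expanded   <- attr(terms(dv ~ (v1 + v2 + v3)^3), "term.labels")
spelledout <- attr(terms(dv ~ v1 + v2 + v3 + v1:v2 + v1:v3 + v2:v3 +
                           v1:v2:v3), "term.labels")
setequal(expanded, spelledout)  # TRUE
```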

Hadley
#
Three comments (which you probably already considered anyway, but it
was not clear from the post):

(1) In general, I would recommend implementing the sequence
drop1()-style, that is, start with the full model and check whether
dropping the highest-order interaction significantly reduces the
goodness of fit, and so on. (Of course, in a perfectly balanced design
it does not matter, but we rarely have data in this shape.) I was not
sure whether you want to advocate this as a way to arrive at a minimal
model. If so, then the drop1() approach makes sure that you do not
accidentally delete low-order interactions before you test the
high-order ones.
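To illustrate the point: drop1() respects marginality, so while an
interaction is in the model, only that highest-order term is offered
for removal, never its constituent main effects. (Shown here on a
plain lm() fit with simulated data for simplicity; the same backward
logic applies to lmer() fits via likelihood-ratio tests.)

```r
# Simulated 2x2 data, fully crossed
set.seed(1)
d <- data.frame(v1 = gl(2, 20),
                v2 = gl(2, 10, 40),
                dv = rnorm(40))
fit <- lm(dv ~ v1 * v2, data = d)

# Only v1:v2 is eligible for dropping while the interaction is present;
# v1 and v2 are protected by the marginality principle.
drop1(fit, test = "F")
```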

(2) Do you plan some branching for separate tests of main effects or
of interactions of the same order, rather than an omnibus test for
removing all main effects or all 2-factor or 3-factor interactions?
Often, we expect only one of the interactions to be significant. In
other words, suppose you have factors A, B, and C. Do you plan to test
the joint effect of A:B, A:C, and B:C, or do you perform the tests for
each of the three interactions separately?

(3) There are quite a few side conditions for whether the LRT
statistics are conservative or anti-conservative (e.g., Pinheiro &
Bates, 2000). So probably there should be a big "Use at your own
risk!" message displayed up front. (In my experience, data from
typical psychological experiments with RT as the DV are usually fine
in this respect--or at least I have not seen evidence to the contrary.)

Reinhold Kliegl

PS: Many psychologists will love you for this LRT script as a
substitute for their favorite omnibus ANOVA F-test. Fortunately, they
will still have to think about planned comparisons to make sense of
the coefficients for factors.