Beginner help setting up a mixed model

On Tue, Oct 12, 2010 at 3:08 PM, Christopher Desjardins
Hi Beth,

On Tue, Oct 12, 2010 at 2:03 PM, Beth Holbrook <bvholbi at yahoo.com> wrote:

I am new to both R and mixed models, so I apologize in advance about the
simplistic nature of my question.

My dataset is unbalanced and have been I have been advised by the
statistician
on my Ph.D. committee to run a mixed model. ?The response variable is
distance.
The factors are preyswim, light, and trial. ?I am interested in the effects
of
light and preyswim on distance. ?Trial is a random factor and there is also
a
crossed random effect of preyswim*trial. ?In SAS, the code would look like
this:

Proc mixed data=laketrout;
Class preyswim light trial;
Model distance = light | preyswim;
Random trial trial*preyswim;
Run;

My understanding is that to test the full ANOVA model in R, I would need to
code
three separate models and determine the most appropriate. ?This is how I
attempted to code the same model using the lme4 package in R:

options(contrasts=c(unordered="contr.SAS", ordered="contr.poly"))
mixed1 <- lmer(distance~preyswim*light + (1|trial) + (1|preyswim),
laketrout,
REML=FALSE)

I am not sure this is the model you want to run. If you want a crossed
random effect I believe you need something like the following:

mixed1 <- lmer(distance~ + (1|trial) + (1|preyswim) + (1|preyswim:light),
laketrout,REML=FALSE)

But then you no longer have any fixed effects. So I am not sure. I wonder if
you want this based on what you said above...

mixed1 <- lmer(distance~ preyswim + light + (1|trial) + (1|preyswim:light),
laketrout,REML=FALSE)

or ...

mixed1 <- lmer(distance~ preyswim + light + preyswim*light + (1|trial),
laketrout,REML=FALSE)

But that model doesn't treat preyswim*light as a crossed random effect

mixed2 <-lmer(distance~preyswim + light + (1|trial) + (1|preyswim),
laketrout,
REML=FALSE)
mixed3 <-lmer(distance~preyswim + (1|trial) + (1|preyswim), laketrout,
REML=FALSE)

My questions are:

1. ?Did I correctly set up the mixed model in R?
2. ?When determining the most appropriate model in R (I used the anova
function
to compare the three models), does it matter whether I use the AIC/BIC or
log-likelihood criterion? ?Along those lines, I'm unsure whether I need to
use
REML=TRUE or REML=FALSE in my code.

The anova function would work here because your models are nested. I believe
it's quite common to use the AIC and use the criterion of Burnham and
Anderson, 2002. Also, I believe you want REML=TRUE when comparing models
with different random effects.
Hmm - I think you have that backwards.  When comparing models with
different *fixed* effects you must use REML=FALSE.  The REML criterion
depends on the structure of the fixed-effects model matrix and you
can't easily compare values of the REML criterion for models with
different fixed-effects structure.  There may be advice to use
REML=TRUE when the random effects change but, if so, I don't know why.

My current attitude is that the usual justification for using REML
(reducing the bias in estimates of variance components) isn't as
important as many people believe, because the distribution of the
estimator can be highly skewed and why should we be overwhelmingly
interested in the mean value of a highly skewed distribution?  I
prefer to stay with maximum likelihood fits so that likelihood ratio
tests and derived criterion like AIC and BIC are well-defined.
I no longer have access to SAS although it is the preferred statistical
program
used by my committee statistician. ?I understand that there are some
limitations
to SAS anyway, particularly when calculating p-values for mixed models with
random effects.
Not sure I understand that statement.  All mixed models have random
effects.  The definition of a mixed models is that it incorporates
both fixed-effects parameters and random effects.

You mentioned p-values and I think you are reversing SAS and lme4
there.  Many people consider it to be a deficiency of the lme4 package
(which is different from R itself - you can blame SAS Institute for
everything that is in SAS but you should not blame the R Foundation
for everything that is in contributed R packages) that it does not
give p-values for fixed-effects coefficients in a model fit by lmer.
The preferred approach with lme4 is to fit the model with and without
a particular term and use a likelihood ratio test to compare the
quality of the fits.  SAS PROC MIXED, on the other hand, is happy to
provide p-values.  In fact, SAS seem to pride themselves on providing
many different p-values for the same test, even when the test wouldn't
make sense.
My preference is to use R for my statistical analyses, as
long
as I know that I'm using it correctly.

Thanks so much for your help,
-Beth

Beth Holbrook
University of Minnesota
bvholbi at yahoo.com

? ? ? ?[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
Christopher David Desjardins
Ph.D. student, Quantitative Methods in Education
M.S. student, Statistics
University of Minnesota

? ? ? ?[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Beginner help setting up a mixed model

Thread (5 messages)