Beginner help setting up a mixed model
On Tue, Oct 12, 2010 at 3:08 PM, Christopher Desjardins
<desja004 at umn.edu> wrote:
Hi Beth, On Tue, Oct 12, 2010 at 2:03 PM, Beth Holbrook <bvholbi at yahoo.com> wrote:
I am new to both R and mixed models, so I apologize in advance about the simplistic nature of my question. My dataset is unbalanced and have been I have been advised by the statistician on my Ph.D. committee to run a mixed model. ?The response variable is distance. The factors are preyswim, light, and trial. ?I am interested in the effects of light and preyswim on distance. ?Trial is a random factor and there is also a crossed random effect of preyswim*trial. ?In SAS, the code would look like this: Proc mixed data=laketrout; Class preyswim light trial; Model distance = light | preyswim; Random trial trial*preyswim; Run; My understanding is that to test the full ANOVA model in R, I would need to code three separate models and determine the most appropriate. ?This is how I attempted to code the same model using the lme4 package in R: options(contrasts=c(unordered="contr.SAS", ordered="contr.poly")) mixed1 <- lmer(distance~preyswim*light + (1|trial) + (1|preyswim), laketrout, REML=FALSE)
I am not sure this is the model you want to run. If you want a crossed random effect I believe you need something like the following: mixed1 <- lmer(distance~ + (1|trial) + (1|preyswim) + (1|preyswim:light), laketrout,REML=FALSE) But then you no longer have any fixed effects. So I am not sure. I wonder if you want this based on what you said above... mixed1 <- lmer(distance~ preyswim + light + (1|trial) + (1|preyswim:light), laketrout,REML=FALSE) or ... mixed1 <- lmer(distance~ preyswim + light + preyswim*light + (1|trial), laketrout,REML=FALSE) But that model doesn't treat preyswim*light as a crossed random effect mixed2 <-lmer(distance~preyswim + light + (1|trial) + (1|preyswim),
laketrout, REML=FALSE) mixed3 <-lmer(distance~preyswim + (1|trial) + (1|preyswim), laketrout, REML=FALSE) My questions are: 1. ?Did I correctly set up the mixed model in R? 2. ?When determining the most appropriate model in R (I used the anova function to compare the three models), does it matter whether I use the AIC/BIC or log-likelihood criterion? ?Along those lines, I'm unsure whether I need to use REML=TRUE or REML=FALSE in my code.
The anova function would work here because your models are nested. I believe it's quite common to use the AIC and use the criterion of Burnham and Anderson, 2002. Also, I believe you want REML=TRUE when comparing models with different random effects.
Hmm - I think you have that backwards. When comparing models with different *fixed* effects you must use REML=FALSE. The REML criterion depends on the structure of the fixed-effects model matrix and you can't easily compare values of the REML criterion for models with different fixed-effects structure. There may be advice to use REML=TRUE when the random effects change but, if so, I don't know why. My current attitude is that the usual justification for using REML (reducing the bias in estimates of variance components) isn't as important as many people believe, because the distribution of the estimator can be highly skewed and why should we be overwhelmingly interested in the mean value of a highly skewed distribution? I prefer to stay with maximum likelihood fits so that likelihood ratio tests and derived criterion like AIC and BIC are well-defined.
I no longer have access to SAS although it is the preferred statistical program used by my committee statistician. ?I understand that there are some limitations to SAS anyway, particularly when calculating p-values for mixed models with random effects.
Not sure I understand that statement. All mixed models have random effects. The definition of a mixed models is that it incorporates both fixed-effects parameters and random effects. You mentioned p-values and I think you are reversing SAS and lme4 there. Many people consider it to be a deficiency of the lme4 package (which is different from R itself - you can blame SAS Institute for everything that is in SAS but you should not blame the R Foundation for everything that is in contributed R packages) that it does not give p-values for fixed-effects coefficients in a model fit by lmer. The preferred approach with lme4 is to fit the model with and without a particular term and use a likelihood ratio test to compare the quality of the fits. SAS PROC MIXED, on the other hand, is happy to provide p-values. In fact, SAS seem to pride themselves on providing many different p-values for the same test, even when the test wouldn't make sense.
My preference is to use R for my statistical analyses, as long as I know that I'm using it correctly.
Thanks so much for your help, -Beth Beth Holbrook University of Minnesota bvholbi at yahoo.com ? ? ? ?[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- Christopher David Desjardins Ph.D. student, Quantitative Methods in Education M.S. student, Statistics University of Minnesota ? ? ? ?[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models