Help establishing mixed model equation for split, plot design
Hi Felipe, please also respond to the mailing list. Sorry, lme only takes nested random effects (see https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified). In order to use the rep:salinity effect, you could add a column for the main plots in your data set: sp.datos$rep_salinityF <- paste(sp.datos$rep,sp.datos$salinityF,paste="_") lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= ~1|rep_salinityF, data=sp.datos) I hope, it works like this. I'm afraid, I don't fully understand your question regarding equal number of samples. Do you mean missing observations? In lme, you need to add na.action="na.omit", while lmer is automatically omitting missing observations. You're right, that the ANOVA is doing type III SS. If you're interested in other SS type, you could look at the Anova() function from the car package. Apparently it can also deal with mixed models. Furthermore, the new version of lmerTest can also handle other SS type. See https://github.com/runehaubo/lmerTestR/blob/master/pkg_notes/new_lmerTest.pdf Best, Samuel
On 04/04/18 10:15, CALLEJA APESTEGUI, FELIPE FRANCISCO wrote:
Hi Samuel, first of all thanks a lot for the quick and clear response. As you state, all models give the same results once they are established following your examples. Now I have a clear view about what is really doing the command and how to continue. Just one quick comment. The line: *lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= ~**1|rep:salinityF, data=sp.datos)* Throws the following error: *Error in getGroups.data.frame(dataMix, groups) : **invalid formula for groups* I leave it here in case is interesting for you. I've decided to use the lmer package because I have other variables that are not balanced, so I prefer to use the same model in all cases, and that package is working fine. Just one final question. Is there any consideration I should have when working with variables without equal number of samples? I mean, something specific in the command of the lmer I should be careful with or investigate further on. Or does the function itself has all considerations included. i see from the *anova(fit.okay3, ddf="Kenward-Roger") *that the anova is made using the type III SS so I wouldn't have to change that part. * * Also, fyi,? the aov function in the case of not grouping by rep throws the same result as the lmer. Again, I really appreciate your help, it has been a lifesaver. Best regards, *Felipe Calleja Ap?stegui* *Predoctoral researcher* * * *Instituto de Hidr?ulica Ambiental "IH Cantabria"* C/ Isabel Torres, N? 15 Parque Cient?fico y Tecnol?gico de Cantabria 39011 Santander (Espa?a) www.ihcantabria.es <http://www.ihcantabria.es/> Tel:? +34 942 20 16 16 Ext. 1153 Fax: +34 942 26 63 61 e-mail: f*elipe-francisco.calleja at alumnos.unican.es* ------------------------------------------------------------------------ *De:* Samuel Knapp <samuel.knapp at tum.de> *Enviado:* martes, 3 de abril de 2018 21:26:31 *Para:* r-sig-mixed-models at r-project.org; CALLEJA APESTEGUI, FELIPE FRANCISCO *Asunto:* Re: Help establishing mixed model equation for split, plot design Hi Felipe, if I understand the design correctly, it is a split-plot design with salinityF as the whole-plot. Only, if you have arranged the boxes in a way that replicate blocks are formed, you should include the rep effect in a model. These replicate blocks could be, that you always have three boxes with the three different salinity levels grouped together, e.g. on one shelf. If you have simply replicated each box 5 times and they are set up fully randomly, you should not include a repliate block. If you have no missing values, and you specify the right model, both aov() and lme() should give you the same results. I would suggest the following models: # with replicate blocks aovmodel <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF + Error(rep/salinityF), data=sp.datos) lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= ~1|rep/salinityF, data=sp.datos) # or with lmer from lme4 package (library (lme4) lmermodel <- lmer(Plantsurvival ~ salinityF*immersionF*SpecTF +(1|rep/salinityF), data=sp.datos) # without replicate blocks (I'm not completely sure, if aov will report the exact same results here) aovmodel <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF + Error(rep:salinityF), data=sp.datos) lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= ~1|rep:salinityF, data=sp.datos) lmermodel <- lmer(Plantsurvival ~ salinityF*immersionF*SpecTF +(1|rep:salinityF), data=sp.datos) As rep/salinityF is simply a short form for rep+rep:salinityF the difference will be that you have no rep block effect in the models without replicate blocks. If you have no missing values, the point about using a mixed model here is less about variance estimation through a random effect, but automatically getting the right F-Test of the main-plot effect in ANOVA (which is the main part of doing a proper split-plot analysis!!!). My experience is that the anova() function from the lmerTest package will give you the right denominator df when using the Kenward-Roger method. You need to use lmer for this: library(lmerTest) anova(lmermodel,ddf="Kenward-Roger") For proper mean comparisons, I suggest to use the emmeans package, with the Tukey test and letter display from the cld() function: library(emmeans) cld(emmeans(lmermodel,"salinityF") #replace salinityF by the factor you are interested Finally, if you have one or more missing values, aov() will return strange anova tables, thus better use a mixed model! Best regards, Samuel -- Samuel Knapp Lehrstuhl f?r Pflanzenern?hrung Technische Universit?t M?nchen (Chair of Plant Nutrition Technical University of Munich) Emil-Ramann-Strasse 2 D-85354 Freising Tel. +49 8161 71-3578 samuel.knapp at tum.de www.researchgate.net/profile/Samuel_Knapp <http://www.researchgate.net/profile/Samuel_Knapp> On 03/04/18 19:16, r-sig-mixed-models-request at r-project.org wrote:
Message: 1 Date: Tue, 3 Apr 2018 07:53:15 +0000 From: "CALLEJA APESTEGUI, FELIPE FRANCISCO" <felipe-francisco.calleja at alumnos.unican.es> To: "r-sig-mixed-models at r-project.org" ??????? <r-sig-mixed-models at r-project.org> Subject: [R-sig-ME] Help establishing mixed model equation for split ??????? plot design Message-ID:
<DB6P191MB0088D09AF911107B33C1DE5C87A50 at DB6P191MB0088.EURP191.PROD.OUTLOOK.COM>
Content-Type: text/plain; charset="iso-8859-1" Hello, I'm looking for some help establishing a mixed model ANOVA using R,
for a split plot design I've made for an experiment of saltmarsh germination. I'll explain as clear as possible the experimental design and afterwards what I've done and my doubts. Hope you can help me. I haven't worked very much with R so many of my doubts are about what I'm "telling" it to do with one or other command.
The experiment consists in meassuring the germination percentage of
one species in variable conditions of salinity, immersion time and presence of other species. I sow seeds in soil core's, that are inside plastic boxes that are filled periodically with saline water. There are three factors: salinity (3 levels: 0, 5 and 18), immersion time (3 levels: 0, 20, 40%), species treatment (2 levels: Baccharis, Baccharis + Juncus). Inside each box there are 6 cores that combine in a complete random design the factors of immersion and species treatment. The box is filled with water at one of the levels of salinity. Thus, I'm using a split plot design with salinity as the whole plot factor, and immersion and species treatment as the subplot factors. All factors are considered fixed. Each box is repeated 5 times. Thus, there are 15 boxes and 90 soil cores. The dependent variable is the percentage of germination of the species Baccharis in each core.
I have several doubts about how to analyze the array. 1 - As far as I understand, although I'm treating all my factors as
fixed, this is a mixed model because of the interaction between subjects (the boxes I believe), and the "split plot nature" of the array, right? In that sense, which function would be better to analyze this, the aov of the stats package, the lme of the nlme package, the lmer of lme4?
?? 2 - I've had trouble calculating the degrees of freedom for the
residuals. The only reference I have is that the error of the whole plot part should have 12 df's, and the within error should have 60 df's. With that reference I've established two possible R commands:
fit.aov2 <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF +
Error(rep:salinityF/immersionF:SpecTF), data=sp.datos)
With rep being the number of repetition. This last one gives the 12
and 60 df's for the error terms.
The other option is: fit.okay <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random=
~1|rep/salinityF, data=sp.datos)
But in this last case, the df's are 8 and 60, which makes me suspect
maybe there is something wrong. But as I said, I haven't cleared my head on which should be the correct df's.
Questions: Is the aov line solving a mixed model adequate for my
design?,
?????????????????????? Is the lme line considering salinityF as a
random factor? If it is, how can I tell it to consider all factors as fixed, but put salinity at the "higher level" of the whole plot and the other ones in the "lower level" of the subplot?
I hope the questions and the experimental array are clear. If there
is any doubt or need more information please let me know. I attach a csv file with the data in case you want to see it.
Finally, if is not too much to ask, I'm fairly new to the splitplot
anova's and R, so I would really appreciate if you could answer be with as much detail as possible, to fully understand what's going on and where to continue.
Thanks a lot, Felipe Calleja Ap?stegui Predoctoral researcher Instituto de Hidr?ulica Ambiental "IH Cantabria" C/ Isabel Torres, N? 15 Parque Cient?fico y Tecnol?gico de Cantabria 39011 Santander (Espa?a) www.ihcantabria.es<http://www.ihcantabria.es/> Tel:? +34 942 20 16 16 Ext. 1153 Fax: +34 942 26 63 61 e-mail: felipe-francisco.calleja at alumnos.unican.es