df pseudoreplication in lme model
Lauren Meyer <lauren.meyer90 at ...> writes:
Hello, I am trying to assess whether or not my df are pseudoreplicated in my lme model. My study was undertaken on five fish (labeled PC), each tested in two replicates (REP), across each combination of three treatments, HOM, C18 and CU, each of which had two levels: HOM (SON, BLD), C18 (SML, BIG), CU (YES, NO). The variable we are assessing is the amount of toxin extracted (TOX1). Some data are missing and have already been removed. I am using an lme model, as the study design is similar to a split-plot design, with a 2x2x2 full factorial design. There are a total of 65 observations. Here is the model I am using:
model<- lme(TOX1~HOM*C18*CU, random=~1|PC/REP, data=Data4, method="ML")
This is a linear mixed-effects model fit by maximum likelihood, which results in 48 DF for everything. Furthermore, I removed the three-way interaction as well as all of the two-way interactions, as they were deemed non-significant, producing the final model:
model5<- lme(TOX1~HOM+C18+CU, random=~1|PC/REP, data=Data4, method="ML")
which has 52 DF. However, I am unsure whether these DF are pseudoreplicated and would like some help in determining whether this is the case. I am happy to upload the full dataset and/or any of the outputs if that would help.
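As a rough sanity check, the two DF values quoted above can be reproduced by simple counting. The sketch below assumes the usual inner-level rule for nested lme models (observations, minus the number of innermost groups, minus the number of non-intercept fixed-effect terms), and it assumes that all ten fish-by-replicate groups are still present after the missing rows were removed. The helper name inner_df is made up for illustration.

```r
# Inner-level denominator DF: observations - innermost groups - non-intercept terms
# (hypothetical helper, not part of nlme)
inner_df <- function(n_obs, n_inner_groups, n_terms) {
  n_obs - n_inner_groups - n_terms
}

n_obs  <- 65      # observations after removing missing data
n_reps <- 5 * 2   # fish x replicate groups, assuming none is empty

# Full factorial: 3 main effects + 3 two-way + 1 three-way interaction
inner_df(n_obs, n_reps, 7)

# Additive model: 3 main effects only
inner_df(n_obs, n_reps, 3)
```

The two calls give 48 and 52, matching the values reported for model and model5.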
I'm not entirely sure what you mean by "pseudoreplicated df". I guess there are quite a few missing observations (since a complete design would have 5 x 2 x 2 x 2 x 2 = 80, and you have 65). In principle, since this is a randomized block design (you have the treatments replicated within every fish*rep combination), the df here should be correct (you can look up the formula for the df of a randomized block design in a general stats book, e.g. Gotelli and Ellison, _A Primer of Ecological Statistics_).

There is one potential issue here, though: technically, since you measured all treatments in every fish, you have the capability to measure whether the treatment effects vary across fish and across replicates (random = ~HOM+C18+CU|PC/REP). However, 5 fish is not very many reps, especially not for estimating a full 3x3 variance-covariance matrix for the treatments ...

Schielzeth and Forstmeier, Behav Ecol 20:416-420 (2009), talk about the importance of accounting for among-individual variation in effects, but caution:
There are a few potential problems when using random slope models. First, if there are only few individuals, the between-individual variance components are difficult to estimate and tend to be underestimated. This leads to unstable and often slightly overconfident SEs. Second, random slope models might not converge, particularly if more than one random intercept and one random slope are included. The number of parameters to be estimated increases substantially because not only the random effect for the intercepts and slopes but also the correlations among them have to be estimated. In case of convergence problems, we suggest following Figure 1 to judge if including random slopes is likely to have a large influence and to run preliminary submodels to decide whether or not to include particular random slopes.
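One way to read the "preliminary submodels" advice is to add a single random slope at a time and compare it with the random-intercept fit. The following is only a sketch under stated assumptions: Data4 is not shown in the thread, so the data frame sim is simulated with invented effect sizes and variances, and the choice of HOM as the slope to try first is arbitrary. With only five fish the random-slope fit may well fail, so it is wrapped in tryCatch.

```r
library(nlme)

set.seed(1)  # simulated stand-in for Data4, which is not shown in the thread

# Complete design: 5 fish x 2 replicates x (2 x 2 x 2) treatments = 80 rows
sim <- expand.grid(
  PC  = factor(paste0("F", 1:5)),
  REP = factor(1:2),
  HOM = factor(c("SON", "BLD")),
  C18 = factor(c("SML", "BIG")),
  CU  = factor(c("YES", "NO"))
)

# Invented effect sizes and variances, for illustration only
fish_eff <- rnorm(5, sd = 1)     # among-fish variation in the intercept
rep_eff  <- rnorm(10, sd = 0.5)  # among-replicate (within fish) variation
sim$TOX1 <- 10 +
  1.0 * (sim$HOM == "SON") +
  0.5 * (sim$C18 == "SML") +
  fish_eff[as.integer(sim$PC)] +
  rep_eff[as.integer(interaction(sim$PC, sim$REP))] +
  rnorm(nrow(sim), sd = 1)

# Drop 15 rows at random to mimic the 65 observations in the question
sim <- sim[sort(sample(nrow(sim), 65)), ]

# Random-intercept model, as in the question
m0 <- lme(TOX1 ~ HOM + C18 + CU, random = ~1 | PC/REP,
          data = sim, method = "ML")

# Denominator DF that lme assigns to each fixed-effect term
print(summary(m0)$tTable[, "DF"])

# Preliminary submodel: let the HOM effect vary among fish only
m1 <- tryCatch(
  lme(TOX1 ~ HOM + C18 + CU, random = list(PC = ~HOM, REP = ~1),
      data = sim, method = "ML"),
  error = function(e) NULL  # treat a failed fit as "no usable model"
)

if (is.null(m1)) {
  message("Random-slope submodel could not be fitted")
} else {
  print(anova(m0, m1))  # likelihood-ratio comparison of the two structures
}
```

Repeating the same comparison with C18 or CU in place of HOM would indicate which, if any, random slopes the data can support before attempting the full random = ~HOM+C18+CU|PC/REP structure. The likelihood-ratio test for a variance component tends to be conservative, so a borderline result should be read with that in mind.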