multiple variance structure in lmer giving zero variances
Sara Krause <skkrause at ...> writes:
I'm hoping someone can help me out. Forgive me if I my mistake is something simple. I am new to mixed models, new to R, and new to lme4 and am struggling to figure everything out. I have two questions that I am hoping someone can answer. 1) Am I using the correct random structure for my model? 2) Can someone help me figure out what is wrong with my syntax to code for random effect variance by treatment group?
(I can understand your desperation, but please don't cross-post ... I almost answered this on r-help with a suggestion that you try on r-sig-mixed instead.)
These questions somewhat go together but let me tackle number one first. I was told by our statistician (who unfortunately doesn't know R well) that my model should include random effects for ID, ID*state, and ID*season. If my understanding of lme4 code is correct, my random structure would appear as it does in this model M1<-lmer (y~trt*season*state*site+(1|ID)+(1|state)+(1|season))
[snipped from below]:
There are two levels in each of these factors. Treatment is G or S (basically treated or control groups) and state is either Pre-treatment or Post-treatment. My other fixed factors are a site factor (two locations) and a season factor (breeding or non-breeding season). Is this correct? (I am starting with the fully crossed fixed effects and I will use iterative model selection to find the optimal model after I make sure that I have the correct random terms)
In general, categorical predictors should not appear in both the fixed and random parts of the model -- that constitutes overfitting. Depending on the structure of your data you should also strongly consider including interactions between treatment and your random factors (see Schielzeth, Holger, and Wolfgang Forstmeier. 2009. ?Conclusions Beyond Support: Overconfident Estimates in Mixed Models.? Behavioral Ecology 20 (2):416?420. doi:10.1093/beheco/arn145. http://www.ncbi.nlm.nih.gov/pubmed/19461866.). However, you can only include treatment interactions that make sense. It sounds like each individual is allocated to a single treatment, so (guessing that your statistician means that you want the *interactions* of ID with state and season) I think you should probably use y~trt*season*state*site+(state|ID)+(0+season|ID) which includes a random effect of state, a state:ID interaction (variation of the state effect among individuals), and a season:ID interaction (variation of seasonal effect among individuals). Why are you moving to lmer? is it so that you can handle the separate effects of state:ID and season:ID ?
The second question is a bit more complicated and changes my random structure as well. I had originally built my model in nlme and used the multiple variance (varIdent) function to allow different variance for two of my terms (trt and state) in my nlme model because different levels in each term had different variances. I found a very helpful post ( https://stat.ethz.ch/pipermail/r-sig-mixed-models/2007q3/000248.html) on how to also do this in lmer but it doesn't seem to be working correctly for me and I have not been able to figure out why.
I'm mostly going to postpone comment on your second question until your first question is sorted out, although I would also encourage you to see if your heteroscedasticity problem might be handled by transformation -- most typically, if treatment increases both the mean and variance a lot, then log transforming the data might take care of the problem with less hassle ...
I am also trying to do this with 2 factors instead of one. The overall structure of my data is Structure of my dataset (d) ..@ frame :'data.frame': 173 obs. of 10 variables: .. ..$ y : int [1:173] 209 382 448 353 224 112 198 273 495 622 ... .. ..$ trt : Factor w/ 2 levels "G","S": 1 1 1 1 1 1 1 1 1 1 ... .. ..$ season: Factor w/ 2 levels "Breeding","NonBreeding": 2 2 2 2 2 2 2 1 1 1 ... .. ..$ state : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ... .. ..$ site : Factor w/ 2 levels "Mrak","Orchard": 1 1 1 1 1 2 2 1 1 1 ... .. ..$ G : num [1:173] 1 1 1 1 1 1 1 1 1 1 ... .. ..$ ID : Factor w/ 59 levels "100","103","105",..: 11 19 22 58 6 36 50 11 20 24 ... .. ..$ S : num [1:173] 0 0 0 0 0 0 0 0 0 0 ... .. ..$ Pre : num [1:173] 0 0 0 0 0 0 0 0 0 0 ... .. ..$ Post : num [1:173] 1 1 1 1 1 1 1 1 1 1 ...
m1<-lmer(y~trt*season*state*site+(0+G|ID)+(0+S|ID) +(0+Pre|ID) + (0+Post|ID)+(1|season), data=d)
Your basic problem here is that it doesn't make sense to test the interaction of treatment with ID (as you have effectively done here), because each individual is only in one treatment category ...