Answers to "Fixing heteroscedasticity in mixed-effects model?"
Hi all, Just to summarize the very helpful set of answers I got to my query (see below for the problem description). 1) Specify a variance structure to account for heterogeneity of residuals across different values of the explanatory variables (e.g., weights = varPower() in the lme() function the nlme package). This book is a fantastic resource: Zuur AF, Ieno EN, Walker NJ, Saveliev AA and Smith GM (2009) Mixed Effects Models and Extensions in Ecology with R. Springer Science and Business Media, New York, NY 2) Try negative binomial (probably won't converge) 3) If you want to go the Gamma route, you can try (1) the development version of lme4 (install from r-forge, but it might be broken right now); (2) glmmADMB 4) Double check exactly which residuals you are getting before worrying further about heterscedasticity. In a linear mixed model (Y = Xb + Zu + e), there are two definitions of the residual: Y-X bhat and Y - (X bhat + Z uhat). You want the second to investigate unequal variance of the e's. 5) Use a stronger transformation, like Y^(-1/2) or Y^(-1) 6) Investigate whether it's reasonable to ignore the heterogeneity. See paper by Phil Dixon and David Fletcher, March 2012 Methods in Ecology and Evolution. I ended up choosing the first route (the route suggested by everyone who replied), following Zuur and using AIC to select a formula for weights, then choosing random effects, and finally choosing fixed effects. I ended up with: mod <- lme(logCd ~ logRe + Hab + logRe:Hab3 , random = ~1+logRe|Study, weights=varExp(form=~logRe|Study)) Including Study in the variance function greatly increases the number of model parameters, but AIC, BIC, and AICc all vastly prefer it over any other formulation I tried. Please drop me an email if anything looks fishy in that model. Thanks again, Malin
On Fri, Mar 23, 2012 at 1:57 PM, Malin Pinsky <malin.pinsky at gmail.com> wrote:
Hi all, I'm having problems fitting a mixed-effects model for an ecological meta-analysis, and I'm curious if anyone has advice. In particular, it's pretty clear that the variance in the residuals increases with the predicted mean, but my normal fixes don't seem to be working. The model is: mod1 <- lmer(logCd ~ logRe + Hab + logRe:Hab + (logRe|Study), data=temp) where Cd is a drag coefficient (>0 before log-transformation), Re is a physical quantity called a Reynolds number (also >0 before transformation), Hab is a categorical variable for habitat, and Study is a categorical variable for the study the data came from. I know from fluid dynamics theory that logCd and logRe can be linearly related, but I expect that the slope and intercept vary between habitat types and between studies. A plot of plot(fitted(mod1), resid(mod1)) looks like a nice cone with the highest spread on the right-hand side of the graph. My initial thought was to use a different transformation for Cd, but I couldn't find a power that made any difference for the problem of increasing variance. I then tried a Gamma error, thinking that this corrects for problems of increasing variance. I had to change the transformation of Cd slightly, now using log(Cd+1) so that there were no negative values: mod1 <- lmer(logCd ~ logRe + Hab + logRe:Hab + (logRe|Study), data=temp, family=Gamma) but the model won't converge. In particular, I get "Warning in mer_finalize(ans) : singular convergence (7)." My dataset has 686 observations across 4 habitat levels and 24 studies. Is there something obvious I'm missing, or are the other avenues to try? Any advice would be welcome. I'm using lmer() from the lme4 package in R 2.14.1 on Mac OS X 10.6.8. And, if this belongs on the R-sig-ME list, let me know. Thanks, Malin -- Malin Pinsky David H. Smith Conservation Research Fellow Department of Ecology and Evolutionary Biology Princeton University malin.pinsky at gmail.com
Malin Pinsky David H. Smith Conservation Research Fellow Department of Ecology and Evolutionary Biology Princeton University malin.pinsky at gmail.com