glmm (binomial, logit) with transformed/scaled predictors
Johannes Radinger <johannesradinger at ...> writes:
Hi,
First I'd like to apologize for what are probably very novice questions: I am new to this list and also rather new to the field of mixed models. By profession I am an aquatic/river ecologist (actually a PhD student), not a statistician/mathematician, so most of my questions are probably related to the topic of rivers/fish.
Actually, this is another not-really-about-mixed-models question (see below).
I want to use a GLMM with presence/absence data (logit model) as the response. The model contains the binary response (presence/absence of a species at a site), 3 continuous predictors (fixed effects: habitat quality, a dispersal metric, a metric of the influence of barriers) and two random effects (in my case species and sites). For that purpose I am using the R function (g)lme from the package lme4.
That's (g)lmer ...
Now several questions appeared:
1) My predictors are partly highly skewed:
summary(Pred1)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
  0.0000   0.0000   0.1143   9.0720   3.8400 616.4000
summary(Pred2)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
 -7.44400  -0.00031   0.00000   0.41560   0.00019 255.70000
summary(Pred3)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.1221  0.3716  0.3734  0.5914  0.9626
In a standard regression model they would definitely need transformation (e.g. log). Is that the first step in a logistic mixed model too?
This is a common misconception. The distribution of the predictors _is not relevant_ to the correctness of a linear or generalized linear model (or of an LMM or GLMM). You might want to transform the predictors in order to improve the _linearity_ of the model (or the additivity of the predictor effects) on the linear predictor scale, i.e. the logit scale in this case, but there is _a priori_ nothing wrong with a skewed distribution of predictors.
2) As I am interested in comparing the relative effect sizes of the three predictors, I was referred to the simple approach of Schielzeth 2010 (thank you for that tip, Mr. Bolker!). That article recommends scaling the predictors to compare their importance, and/or scaling and centering them if one also wants to model interactions between continuous predictor variables. So my question is: how would that interfere with any transformation of the variables as described in my first question? And what is the correct order: first transformation, then scaling?
If you find it useful to transform your predictors, it will probably be easier to transform first and then center/scale.
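A minimal sketch of that order, assuming (purely for illustration) a made-up right-skewed, non-negative predictor called pred1 and assuming that log1p() is a sensible transformation for it; neither comes from your data:

```r
set.seed(1)

# Hypothetical right-skewed, non-negative predictor (NOT the real Pred1)
pred1 <- rexp(200, rate = 0.1)

# Step 1: transform (log1p(x) = log(1 + x), so zeros are not a problem)
pred1_t <- log1p(pred1)

# Step 2: center and scale the *transformed* values
pred1_ts <- as.numeric(scale(pred1_t))

# Mean and SD of the result: approximately 0 and 1
c(mean = mean(pred1_ts), sd = sd(pred1_ts))
```

Going the other way round would generally not even work for a log-type transformation, because the centered values include negative numbers.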
3) After reading some list posts like this one: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015591.html I learnt that comparing two lmer models (likelihood ratio test with anova()) is the way to find out whether a predictor or interaction is significant in a model. Thus this method (comparing more complex with less complex models) can lead to the most parsimonious model, comprising only the most important predictors and interactions. After reading Schielzeth 2010, I think this comparison test should be made with centered and scaled variables (transformed beforehand if needed)?
In principle, centering and scaling variables (only) should not affect the overall fit/log-likelihood of a model; it just eases interpretation. So the answer is "it doesn't matter".
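You can check this yourself on a toy example. The sketch below uses simulated data and a plain glm() rather than glmer(), only to keep it self-contained in base R; the variable names and the simulation settings are my own assumptions, not anything from your data set:

```r
set.seed(101)
n <- 500

# Hypothetical skewed predictor and binary response (simulated data)
x <- rexp(n, rate = 0.2)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.3 * x))

# Centered and scaled copy of the same predictor
x_sc <- as.numeric(scale(x))

fit_raw <- glm(y ~ x,    family = binomial)
fit_sc  <- glm(y ~ x_sc, family = binomial)

# Compare the two log-likelihoods (equal up to numerical tolerance)
ll_raw <- as.numeric(logLik(fit_raw))
ll_sc  <- as.numeric(logLik(fit_sc))
isTRUE(all.equal(ll_raw, ll_sc, tolerance = 1e-6))
```

The two fits are just reparameterizations of each other, so a likelihood ratio test between nested models should come out the same whichever version of the predictor you use.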
4) This then brings me to my final interpretation of the results. In this context I came across the post mentioned above and a tutorial: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/odds_ratio.htm I think that tutorial really helped me to understand the meaning of odds etc. But how can odds be interpreted if the predictors were transformed beforehand?
I don't think transforming the _predictors_ changes the interpretation of the _response variable_ ...
First I'd calculate the final parsimonious model once with transformed and scaled predictors to get an idea about the relative impact of each independent predictor. Second, the same model with transformed but with unscaled predictors can provide absolute parameter estimates for the odds.
Something like that.
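To make the interpretation concrete: the exponentiated coefficient is still an odds ratio, just per unit of whatever scale the predictor had when it went into the model. Here is a rough sketch with simulated data, again using glm() instead of glmer() for simplicity; the log2 transformation and all names are assumptions chosen for illustration:

```r
set.seed(42)
n <- 1000

# Hypothetical skewed predictor, log2-transformed before modelling
x  <- rlnorm(n, meanlog = 0, sdlog = 1)
lx <- log2(x)
y  <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * lx))

# Transformed but unscaled: one unit of log2(x) is a doubling of x
fit_unscaled <- glm(y ~ lx, family = binomial)
or_per_doubling <- exp(coef(fit_unscaled)[["lx"]])

# Transformed and scaled: one unit is one standard deviation of log2(x)
lx_sc <- as.numeric(scale(lx))
fit_scaled <- glm(y ~ lx_sc, family = binomial)
or_per_sd <- exp(coef(fit_scaled)[["lx_sc"]])

# The two odds ratios describe the same fit on different unit scales:
# or_per_sd equals or_per_doubling raised to the power sd(lx)
c(per_doubling = or_per_doubling, per_sd = or_per_sd,
  check = or_per_doubling^sd(lx))
```

So the scaled fit is convenient for comparing predictors with each other, and the unscaled one gives odds ratios in units you can state directly (here, "per doubling of x").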