is multicollinearity of fixed effects resolved by random effects

I am not sure how the 800 SPECIES and 24 SITEs relate to each other
in this email. The following proposal assumes that a given SPECIES is
observed at each or at least quite a few of the 24 sites (i.e., I
assume you have several, ideally up to 24, measures for each species).
If each species occurs at precisely one site and if you have only one
measure for each species at this site (i.e., if you have exactly 800
observations), then SPECIES should not be part of the model. So let's
assume you have several measures for each species.

First, treat site and species as (possibly partially) crossed random
effects, assuming that the species in site 1 and site 2 are pretty
much the same. Nesting would be required if species 1 is a
different beast in site 1 and in site 2. Think of student 1 in class
1 and student 1 in class 2; they are likely different persons. However,
if you observe student 1 in a music class and the same student 1 in a
math class, they are obviously the same person. In lmer, if you give
units the same identification, it will assume they are the same unit.
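
A quick way to see whether the two factors are crossed or nested is to
cross-tabulate them. The sketch below uses made-up data (5 species, 4
sites); the data frame ds and its contents are assumptions for
illustration only, not your data.

```r
# Made-up data purely for illustration: 5 species, 4 sites.
ds <- expand.grid(SPECIES = paste0("sp", 1:5), SITE = paste0("site", 1:4))
ds <- ds[-c(2, 7, 13), ]   # drop a few combinations: partial crossing

# Rows = species, columns = sites, entries = number of observations.
tab <- xtabs(~ SPECIES + SITE, data = ds)
print(tab)

# Number of distinct sites at which each species was observed.
sites_per_species <- rowSums(tab > 0)
print(sites_per_species)

# If every species occurred at exactly one site, SPECIES would be
# nested within SITE rather than crossed with it.
is_nested <- all(sites_per_species == 1)
print(is_nested)
```

If the same species label really denotes different units at different
sites, lme4's nesting syntax (1 | SITE/SPECIES) keeps them apart.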

Second, definitely center your predictors prior to any analyses. Here
you need to think about the level at which you want to center and you
will need to read up on this. For starters, centering them on their
grand mean should not be wrong, but it may not be optimal. (This
becomes an issue especially if you want to include covariates that
describe the sites, i.e., that are identical for all SPECIES observed
at a given SITE; actually I suspect that your three predictors are
such variables.)
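
To make the centering choice concrete, here is a minimal sketch with
made-up data in which MAT is a site-level covariate and sites
contribute unequal numbers of observations. The column names MAT_c and
MAT_cs are hypothetical. Centering on the mean over observations and
centering on the mean over sites then give different results.

```r
# Made-up data: 4 sites with 1, 2, 3 and 6 observations; MAT is
# constant within a site (a site-level covariate).
n_obs <- c(1, 2, 3, 6)
ds <- data.frame(
  SITE = rep(paste0("site", 1:4), times = n_obs),
  MAT  = rep(c(5, 10, 15, 20), times = n_obs)
)

# Grand-mean centering over all observations.
ds$MAT_c <- ds$MAT - mean(ds$MAT)

# Alternative: center on the mean of the site values, so that each
# site counts once regardless of how many observations it has.
site_vals <- unique(ds[, c("SITE", "MAT")])
ds$MAT_cs <- ds$MAT - mean(site_vals$MAT)

print(c(obs_mean = mean(ds$MAT), site_mean = mean(site_vals$MAT)))
```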

Third, if collinearity is very strong, you may want to think of
combining your predictors into a single one or collapsing two of them;
after all, they seem to be getting at the same thing. See also John
Maindonald's suggestions on this topic. Definitely get a good idea
about how your predictors relate to the dependent variable. You may
also want to check whether higher-order polynomials are likely required
in the fixed-effects part (e.g., a quadratic trend for MAT, etc.).
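
As a rough sketch of how one might inspect the collinearity and
collapse the predictors, the following uses simulated site-level data
(the numbers and the combined variable CLIM are assumptions of mine).
A principal component is only one of several ways to combine them.

```r
# Simulated site-level predictors in which MAT and MAP both track LAT.
set.seed(3)
n <- 24
LAT <- runif(n, 10, 60)
MAT <- 30 - 0.4 * LAT + rnorm(n, sd = 1)
MAP <- 2000 - 20 * LAT + rnorm(n, sd = 150)
sites <- data.frame(MAT, MAP, LAT)

# Pairwise correlations among the three predictors.
print(round(cor(sites), 2))

# One option: replace the three predictors by the first principal
# component of the standardized variables.
pc <- prcomp(sites, center = TRUE, scale. = TRUE)
print(summary(pc)$importance)
sites$CLIM <- pc$x[, 1]   # hypothetical combined climate score
```

A quadratic trend can be written in the model formula as poly(MAT, 2),
which uses orthogonal polynomials by default.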

# ML fits: current lme4 takes REML = FALSE (older versions used method = "ML")
mix.model1 <- lmer(X13C ~ MAT + MAP + LAT + (1 | SPECIES) + (1 | SITE),
                   REML = FALSE, data = ds)
mix.model2 <- lmer(X13C ~ (MAT + MAP + LAT)^2 + (1 | SPECIES) + (1 | SITE),
                   REML = FALSE, data = ds)
and, possibly,
mix.model3 <- lmer(X13C ~ MAT * MAP * LAT + (1 | SPECIES) + (1 | SITE),
                   REML = FALSE, data = ds)
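
Since these models are nested and fit by maximum likelihood, they can
be compared with a likelihood-ratio test via anova(). The following is
a sketch on simulated data only; the effect sizes, sample sizes, and
the names m1 and m2 are my assumptions.

```r
library(lme4)

# Simulated data: 30 species fully crossed with 12 sites, with
# site-level predictors.
set.seed(4)
n_sp <- 30
n_site <- 12
site <- data.frame(SITE = factor(1:n_site), MAT = rnorm(n_site),
                   MAP = rnorm(n_site), LAT = rnorm(n_site))
ds <- merge(expand.grid(SPECIES = factor(1:n_sp),
                        SITE = factor(1:n_site)), site)
sp_eff   <- rnorm(n_sp, sd = 1)
site_eff <- rnorm(n_site, sd = 0.5)
ds$X13C <- -27 + 0.5 * ds$MAT +
  sp_eff[as.integer(ds$SPECIES)] + site_eff[as.integer(ds$SITE)] +
  rnorm(nrow(ds), sd = 0.5)

# Main effects only versus main effects plus two-way interactions.
m1 <- lmer(X13C ~ MAT + MAP + LAT + (1 | SPECIES) + (1 | SITE),
           data = ds, REML = FALSE)
m2 <- lmer(X13C ~ (MAT + MAP + LAT)^2 + (1 | SPECIES) + (1 | SITE),
           data = ds, REML = FALSE)

# Likelihood-ratio test plus AIC/BIC for the two fits.
cmp <- anova(m1, m2)
print(cmp)
```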

Further things:
After you feel happy with your fixed-effects part (or better: once you
developed a substantively and statistically defensible
representation), consider including (some of) them as varying slopes
into the random part of the model. I would start with those that have
the largest fixed effects, e.g.,

# ML fits: current lme4 takes REML = FALSE (older versions used method = "ML")
mix.model4 <- lmer(X13C ~ MAT + MAP + LAT + (MAT | SPECIES) + (MAT | SITE),
                   REML = FALSE, data = ds)
or
mix.model5 <- lmer(X13C ~ MAT + MAP + LAT + (MAT | SPECIES) + (1 | SITE),
                   REML = FALSE, data = ds)
or
mix.model6 <- lmer(X13C ~ MAT + MAP + LAT + (1 | SPECIES) + (MAT | SITE),
                   REML = FALSE, data = ds)

etc.

Definitely check the distribution of the residuals of your final
model(s). You may need to think about a transformation of your
dependent variable.
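
A minimal sketch of such a residual check, again on simulated data
(the data and the name fit are assumptions for illustration): plot the
residuals against the fitted values and look at a normal
quantile-quantile plot.

```r
library(lme4)

# Simulated data: 20 species crossed with 10 sites.
set.seed(5)
ds <- expand.grid(SPECIES = factor(1:20), SITE = factor(1:10))
mat_site <- rnorm(10)
sp_eff   <- rnorm(20, sd = 1)
site_eff <- rnorm(10, sd = 0.5)
ds$MAT  <- mat_site[as.integer(ds$SITE)]
ds$X13C <- -27 + 0.5 * ds$MAT +
  sp_eff[as.integer(ds$SPECIES)] + site_eff[as.integer(ds$SITE)] +
  rnorm(nrow(ds), sd = 0.5)

fit <- lmer(X13C ~ MAT + (1 | SPECIES) + (1 | SITE),
            data = ds, REML = FALSE)
res <- resid(fit)

# Residuals versus fitted values: look for funnels or curvature.
plot(fitted(fit), res, xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot: strong departures from the line may suggest
# transforming the dependent variable.
qqnorm(res)
qqline(res)
```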

One question for John Maindonald: Why would you include SPECIES as a
fixed effect? It leads to 799 parameters being estimated. Or am I
missing something here?

Reinhold Kliegl

On Sun, May 18, 2008 at 3:07 AM, John Maindonald
