gls error

2 messages · Benjamin Gillespie, Ben Bolker

#
Morning all,

I wonder if anyone could shed some light on an error I get in R when I try to fit a model:

I'm attempting to follow the 'Protocol' in Chapters 4 & 5 of Zuur et al. (2009) for data from a number of river sites, each sampled once for macroinvertebrates. Each site has been graded into one of three groups depending on its characteristics. I want to find out whether the factor "group" is significant.

I have a response variable, "simp" (Simpson's diversity index), and a number of fixed factors that I would like to include in my model.

In R, this is the code I use:

library(nlme)  # provides gls()
f1 <- formula(simp ~ group + date + altitude + data_source + catchment_size +
              g1 + g2 + g3 + g4 + g5 + g6 + lc1 + lc2 + lc3 + lc4 + lc5)
s1.gls <- gls(f1, data = env.sp)

Please note: g1...gX are the % cover of each geology type at each site, and lc1...lcX are the % cover of each land cover type at each site.

The following is the error I receive:

Error in glsEstimate(glsSt, control = glsEstControl) : computed "gls" fit is singular, rank 16

What would you suggest one should do in this instance?

Many thanks in advance for your advice,
		
Ben Gillespie
Research Postgraduate
#
Benjamin Gillespie <gybrg at ...> writes:
> [...] g1+g2+g3+g4+g5+g6+lc1+lc2+lc3+lc4+lc5)
> [...] lc1...lcX are % land cover types for each site.
> [...] "gls" fit is singular, rank 16

If you use a full set of compositional data as predictors (i.e. A,
B, C, D such that A+B+C+D=1) then you will necessarily have a
multicollinearity problem, even if the pairwise correlations between
the components aren't that high.  The correlation between any
component and the sum of all of the other components is exactly -1 (as
A+B+C increases, D must decrease).  You should leave one out
(preferably not a rare component, because if you leave out a rare
component the remaining components will still be pretty strongly
multicollinear).  Alternatively, you could use something like
a log-ratio transform (see Aitchison and others) to transform
the n-dimensional compositional predictor to an (n-1)-dimensional
set of variables.
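
A minimal sketch of both suggestions, using simulated data rather than the
original env.sp data set. All object names below (raw, comp, dat, alr, the
four hypothetical cover columns g1 to g4) are made up for illustration, and
the additive log-ratio shown here is only one of several log-ratio transforms.
It checks the rank of the design matrix with base R's qr() instead of calling
gls(), since a rank-deficient design matrix is what produces a "singular"
fit in this situation. If both the g and the lc columns sum to 100 at every
site, the same step would presumably be needed for each set.

```r
# Simulated data; all names and values here are hypothetical.
set.seed(1)
n <- 40
raw <- matrix(rexp(n * 4), ncol = 4)       # strictly positive raw amounts
comp <- raw / rowSums(raw) * 100           # four % cover columns, each row sums to 100
colnames(comp) <- paste0("g", 1:4)
dat <- data.frame(simp = runif(n), comp)

# Full composition plus intercept: g1 + g2 + g3 + g4 = 100 * intercept,
# so one column is a linear combination of the others.
X_full <- model.matrix(simp ~ g1 + g2 + g3 + g4, data = dat)
rank_full <- qr(X_full)$rank               # smaller than ncol(X_full)

# Option 1: leave one (preferably not rare) component out.
X_drop <- model.matrix(simp ~ g1 + g2 + g3, data = dat)
rank_drop <- qr(X_drop)$rank               # equals ncol(X_drop)

# Option 2: additive log-ratio transform with g4 as the reference component.
# This needs strictly positive values; zeros must be dealt with beforehand.
alr <- log(comp[, 1:3] / comp[, 4])
colnames(alr) <- paste0("alr_g", 1:3)
dat_alr <- data.frame(simp = dat$simp, alr)
X_alr <- model.matrix(simp ~ ., data = dat_alr)
rank_alr <- qr(X_alr)$rank                 # equals ncol(X_alr)

print(c(cols_full = ncol(X_full), rank_full = rank_full,
        rank_drop = rank_drop, rank_alr = rank_alr))
```

With real data, comparing qr(model.matrix(f1, data = env.sp))$rank against
the number of columns of that matrix is a quick way to see how many
predictors are redundant before refitting.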

  (This isn't really a mixed model question ...)