simplifying a GLM-removing categorical variables

2 messages · mariannej, Ben Bolker

#
Hi,

Thanks for reading this message!

I have created a GLM (using the quasipoisson family) and am now trying to
simplify it.  One of my explanatory variables is categorical (vegetation
type, with 6 different levels).  In the model, 5 of the 6 levels are
significant and one is not. 

How should I simplify my model?  Do I need to take out the whole category
(i.e. all of vegetation type), or just the level that is not significant
(but how would I explain this biologically)?

Please spell out any answers simply, as I am new to R.

Thanks very much
Marianne
#
mariannej <marianne.james <at> abdn.ac.uk> writes:
This is really a statistical rather than an R question,
but the short answer is: you probably shouldn't try to
remove the "non-significant" level.  The details depend on
your model, but the "significance" of the parameters,
which I assume you're gleaning from summary(), refers
to the difference of each level from the baseline (first)
level.  If 5 out of the 6 levels are significantly different
from the baseline, then the factor belongs in the model.
(You could _conceivably_ try to lump the "non-significant"
level together with the baseline level, but this really
goes in the direction of data-dredging.)
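
To make the point above concrete, here is a minimal sketch on simulated
data.  Everything in it (the data frame d, the variables counts and veg,
the level names, the effect sizes) is made up for illustration and is not
taken from the original poster's data.  It shows that summary() reports
each level against the baseline, that drop1() with an F test (the usual
choice for quasi families) tests the factor as a whole, and that changing
the baseline with relevel() changes the individual comparisons but not
the overall fit.

```r
# Simulated data: all names and numbers here are hypothetical.
set.seed(1)
veg <- factor(rep(c("bog", "grass", "heath", "moor", "scrub", "wood"),
                  each = 20))

# Assumed log-mean for each level; "grass" is set equal to the baseline "bog".
log_mu <- c(bog = 1.0, grass = 1.0, heath = 1.6,
            moor = 2.0, scrub = 0.3, wood = 2.4)

# Overdispersed counts, drawn from a negative binomial.
counts <- rnbinom(length(veg), mu = exp(log_mu[as.character(veg)]), size = 2)
d <- data.frame(counts = counts, veg = veg)

fit <- glm(counts ~ veg, family = quasipoisson, data = d)

# One row per coefficient: the intercept is the baseline level ("bog"),
# and every other row is the difference of that level from the baseline.
summary(fit)$coefficients

# Test the whole factor at once (all 5 contrasts together).
whole <- drop1(fit, test = "F")
whole

# Same model with a different baseline: the individual rows change,
# but the deviance (the overall fit) does not.
d$veg2 <- relevel(d$veg, ref = "wood")
fit2 <- glm(counts ~ veg2, family = quasipoisson, data = d)
all.equal(deviance(fit), deviance(fit2), tolerance = 1e-6)
```

If the row for veg in the drop1() table is clearly significant, the factor
stays in the model as a whole, whatever the individual rows of summary()
say about any single level.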

   I would strongly recommend that you consult a good
general text on generalized linear models for strategies
of model simplification and interpretation -- to repeat,
this is really a statistical question and not an
R-specific one ...

  good luck,
    Ben Bolker