random vs fixed effects and glmer model simplification - R-SIG-mixed-models

Thu, Aug 14, 2014 8:11 AM
Dear List,

I'm new to statistics and R, so apologies for the beginner question.

I have a dataset with count data from a large sample of people and I am
trying to specify the most appropriate model. I am not sure whether I
should be using a mixed model or not. I have been more specific below, but
I suppose my general question is how do you decide whether a factor is
fixed or random? I read in Crawley's R book that generally fixed effects
vary in mean over factor levels whereas random effects vary in variance
over factor levels, but this definition does not seem to be consistent over
the various (and sometimes dubious) internet sources I've found on the
subject.

More specifically -

I have Poisson-distributed data on the number of hours people spend doing
various activities with their children. The data come from several years
and countries, from both mothers and fathers, and include how many hours a
week each parent works. So, I have come up with 2 potential models:

Model 1
model <- glm(hours ~ parent * work + year + country, family="poisson")
# then go on to do model simplification, comparing against a reduced model (model2), with
anova(model, model2, test="Chisq")

Model 2
glmer(hours ~ parent * work + (1|year) + (1|country), family="poisson")

where hours is a Poisson-distributed numeric ranging from 0 to 35, parent is
a two-level factor (mother or father), work is a numeric ranging from 0 to
52, year is a factor with 11 levels (2004-2014) and country is a factor with
6 levels (6 country names).
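
To make the two models concrete, here is a minimal sketch on simulated data
with the same structure as described above. The data frame name `dat`, the
sample size, the seed and the effect sizes are all assumptions made up for
illustration and are not taken from the real dataset; the simulated hours
are also not capped at 35. The mixed model is only fitted if the lme4
package happens to be installed.

```r
# Simulated stand-in for the real data; every name and number here is hypothetical.
set.seed(1)
n <- 600
dat <- data.frame(
  parent  = factor(sample(c("mother", "father"), n, replace = TRUE)),
  work    = sample(0:52, n, replace = TRUE),
  year    = factor(sample(2004:2014, n, replace = TRUE)),
  country = factor(sample(paste0("country", 1:6), n, replace = TRUE))
)

# Linear predictor on the log scale (made-up effects of parent and work only)
eta <- 2 + 0.3 * (dat$parent == "mother") - 0.01 * dat$work
dat$hours <- rpois(n, lambda = exp(eta))

# Model 1: year and country entered as fixed effects
model <- glm(hours ~ parent * work + year + country,
             family = poisson, data = dat)

# One simplification step: drop the parent:work interaction, then compare
# the two nested fits with a likelihood-ratio (chi-squared) test
model2 <- update(model, . ~ . - parent:work)
print(anova(model, model2, test = "Chisq"))

# Model 2: year and country entered as random intercepts
if (requireNamespace("lme4", quietly = TRUE)) {
  mixed <- lme4::glmer(hours ~ parent * work + (1 | year) + (1 | country),
                       family = poisson, data = dat)
  print(mixed)
}
```

Because the simulated data contain no real year or country variation, the
random-effect variances in the mixed fit may be estimated at or near zero;
the point is only to show the shape of the two calls.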

I suppose I have a few questions from this. First, does anyone know which
of these models is more appropriate? Given the Crawley definition above, my
data do vary in mean over year and country, so they should be fixed
effects, but as I mentioned above, I am not sure about this definition.
Second, if the best model is in fact the mixed model, how do I go about
model simplification for this? I read in the documentation that you
shouldn't just do model simplification on the model as it is, but should add
REML=F beforehand, but I get error messages for this.

Once again, please accept my apologies for any stupid questions - this is
all very new to me and I'd be grateful for any pointers and constructive
criticism!!

Thanks very much

Saoirse.

(PhD student)