On Thu, Jan 28, 2010 at 6:57 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
Dear Thierry,
I THINK the fixed effect slope should be what you're after if you want to
predict the change in log numbers, but simply exponentiating the prediction
will not give you a true measure of the arithmetic increase.
I too think that the fixed-effect slope should be an estimate of the
population slope on the log(count) scale, except for the usual
problems with counts of zero and, in this case, the (1|Year) random
effects term. I can appreciate that you may want to incorporate year
to year variability due to weather conditions in the model but I'm not
sure what the effect of that on the fixed effect for Year would be. I
could imagine an argument for them not interfering with each other
(the fixed effect is measuring the trend and the random effect
measures year-to-year variability around the trend line) but I am not
confident of that argument.
The arithmetic prediction for years 1:10 (for example) when the slope
variance for the year|room term is zero would be:
exp(b_year*1:10+0.5*(v1+v2))
where b_year is your slope estimate, and v1 is the year intercept variance
and v2 is the room intercept variance.
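
A minimal sketch of the formula above in base R, to show why plain
exponentiation falls short. The values of b_year, v1 and v2 are made up for
illustration (they are not estimates from the bat data), and the intercept is
left out, as in the formula.

```r
# Hypothetical values, for illustration only (not estimates from the bat data)
b_year <- 0.05   # fixed-effect slope on the log scale
v1     <- 0.10   # variance of the (1|Year) intercepts
v2     <- 0.40   # variance of the room intercepts (slope variance assumed zero)
years  <- 1:10

# Exponentiating the linear predictor alone: the value when all
# random effects are exactly zero
naive <- exp(b_year * years)

# Arithmetic mean on the count scale: add half the total random-effect variance
arithmetic <- exp(b_year * years + 0.5 * (v1 + v2))

# Monte Carlo check for year 10: average exp() over simulated random effects
set.seed(1)
u <- rnorm(1e6, mean = 0, sd = sqrt(v1 + v2))
mc_year10 <- mean(exp(b_year * 10 + u))

print(round(cbind(years, naive, arithmetic), 3))
print(c(formula = arithmetic[10], simulated = mc_year10))
```

The ratio arithmetic / naive is the constant exp(0.5*(v1+v2)) here, so the
correction shifts the level but not the slope as long as the variances do not
depend on year.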
When slope variance exists this becomes more difficult, because it implies
the variance v2 changes as a function of year. In this case:
v2=diag(Z%*%V2%*%t(Z))
where
Z<-cbind(rep(1,10), 1:10)
and V2 is the covariance matrix of the room intercept-slopes.
Or if you like
v2 = V2[1,1]+(1:10)*V2[1,2]*2+(1:10)^2*V2[2,2]
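
As a rough check that the matrix form and the expanded form agree, here is a
small base R sketch. The covariance matrix V2 and the values of b_year and v1
are hypothetical. The expanded form is written with (1:10)^2, because R parses
1:10^2 as 1:100.

```r
# Hypothetical covariance matrix of the room intercepts and slopes
V2 <- matrix(c( 0.40, -0.02,
               -0.02,  0.01), nrow = 2, byrow = TRUE)
years <- 1:10
Z <- cbind(rep(1, 10), years)

# Matrix form: variance of (intercept + slope * year) for each year
v2_matrix <- diag(Z %*% V2 %*% t(Z))

# Expanded form of the same quantity
v2_expanded <- V2[1, 1] + 2 * years * V2[1, 2] + years^2 * V2[2, 2]

# The two forms should agree up to floating-point error
print(all.equal(as.numeric(v2_matrix), v2_expanded))

# Year-specific arithmetic prediction, with hypothetical b_year and v1
b_year <- 0.05
v1     <- 0.10
print(exp(b_year * years + 0.5 * (v1 + v2_expanded)))
```

Because v2 now depends on year, the correction term is no longer a constant
multiplier, so the trend on the arithmetic scale can differ from
exp(b_year).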
Another difficulty is the possibility that your missing data are not
"completely missing at random". By default lmer just seems to omit missing
data rather than dealing with it properly, but perhaps there is an argument
that can be passed to na.omit which suppresses this?
I'm not sure what you mean by "dealing with it properly". Are you
considering some form of imputation?
My general approach is that, because the methods in lmer allow for
unbalanced data, there would not be a purpose in imputing counts that
were not observed. I presume that when Number is observed the Year
and Room are also recorded (otherwise you should get rid of some of
the members of your field crew). The only benefit that I could
imagine for imputing cases that were not observed would be if the
computational methods required balanced data.
Perhaps I am misunderstanding what you are getting at here, Jarrod.
If so, then the less
strict assumption of "missing at random" can be made. In this latter case
the missing data only have to be random conditional on the observed data -
for example, if there were no bats in room A in year 1 which made the field
workers less inclined to visit room A in year 2 based on their knowledge of
the 1'st year's count.
Cheers,
Jarrod
Quoting "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be>:
Dear all,
We are modelling the total number of hibernating bats in a fortress. We
have data on the number of bats per room spanning ten years. The main
problem is that not all rooms were visited each year. The fieldworkers
did not know or find all rooms, and some rooms were not always
accessible.
Some of the rooms were not counted in the early years and they contain a
rather high number of bats in the more recent years. So a glm on the
total observed number would be very biased. Therefore we would use a
mixed model on the numbers of bats per room. The model looks like:
glmer(Number ~ Year + (1|Year) + (Year|Room), family = poisson). Year is
the long-term trend. (1|Year) allows for year-to-year variability (due
to weather conditions) and (Year|Room) allows for a random intercept and
slope per room.
Our main question about this model is the interpretation of the
long-term trend (fixed effect of Year). Given the model specification it
is the trend in an 'average' room from the population of rooms. Can we
assume that this trend equals the trend in the total number of bats in
the fortress? That would be the trend in the total observed numbers if we
could have investigated every room in every year.
Or is it better to use the model to simulate the total number of bats
and then model these simulated totals using a simple glm? Repeating the
simulations a large number of times would yield an average and
confidence intervals for the trend.
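
A rough base R sketch of the simulate-then-glm idea, under loud assumptions:
the parameter values below (b0, b_year, v_year, V_room, n_rooms) are invented,
and the random effects are drawn fresh in every replicate. In a real analysis
the draws would come from the fitted glmer model instead.

```r
set.seed(42)

# Hypothetical settings and parameter values (not estimates from the bat data)
n_rooms <- 30
years   <- 1:10
n_sim   <- 200
b0      <- 1      # intercept on the log scale
b_year  <- 0.05   # fixed-effect trend on the log scale
v_year  <- 0.10   # variance of the (1|Year) intercepts
V_room  <- matrix(c(0.40, -0.02, -0.02, 0.01), nrow = 2)  # (Year|Room) covariance
L       <- t(chol(V_room))  # lower-triangular factor for correlated draws

sim_trend <- function() {
  # Correlated room intercepts (row 1) and slopes (row 2)
  room     <- L %*% matrix(rnorm(2 * n_rooms), nrow = 2)
  year_eff <- rnorm(length(years), sd = sqrt(v_year))
  # Linear predictor for every year (rows) and room (columns)
  eta <- outer(years, seq_len(n_rooms), function(y, r) {
    b0 + b_year * y + year_eff[y] + room[1, r] + room[2, r] * y
  })
  counts <- matrix(rpois(length(eta), exp(eta)), nrow = length(years))
  total  <- rowSums(counts)  # fortress total per year, every room counted
  coef(glm(total ~ years, family = poisson))[["years"]]
}

trends <- replicate(n_sim, sim_trend())
print(c(mean = mean(trends), quantile(trends, c(0.025, 0.975))))
```

Comparing the simulated trends with b_year gives a feel for how far the trend
in the total can drift from the fixed-effect slope when rooms have different
slopes.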
Best regards,
Thierry
--
ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek, INBO)
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be