Hello Ben and all,
Thank you Ben, very much for your previous reply. Yes, stand/plot makes
sense. After talking with a few statisticians, I got convergence! The
repeated measure issue is solvable with a dummy variable for each of the 3
years (like a fiscal or school year) - so yr1 = 2010smr+2011wtr, and so
on. Pellet plots were sampled 6 times, 3 each season, vegetation only
once. Among year variability can be high.
I'm now attempting to make predictive plots from models. I used UCLA's
method posted at their IRDA site that has a negative binomial example:
http://www.ats.ucla.edu/stat/r/dae/nbreg.htm
newdata2 <- cbind(newdata2, predict(m1, newdata2,type =
"link", se.fit=TRUE))
newdata2 <- within(newdata2, {
DaysAbsent <- exp(fit)
LL <- exp(fit - 1.96 * se.fit)
UL <- exp(fit + 1.96 * se.fit)
})
but I have an uneasy feeling about fitting standard errors from a
gaussian distribution.
In case it helps, what you're assuming is Normal is the *sampling
distribution of the parameters*, not the data themselves ...
The bottom line is that what you're doing seems reasonable, with
the caveats already expressed in ?predict.glmmadmb ...
In the past I've used MCMCglmm to estimate credible intervals around the
beta coefficients of a categorical variable, but now,
I have a continuous variable, conifer sapling count.
Well, technically it's a discrete (count) variable, not continuous ...
Section 2 of the glmmADMB vignette [vignette("glmmADMB")] has
a bit of information about running post-fit MCMC. There should
be an example of how to generate confidence intervals (basically,
you would generate predictions for each sampled set of coefficients
and then compute the quantiles or credible intervals of the
predictions), but there isn't (yet ...)
So, lots of questions!
How does one make an accurate predictive graph from a nbinom model?
Good question. Not easy, you may need to take some shortcuts
and hope for the best.
Does it make sense to set up a dataframe with the means of the
covariates when the covariates'
distributions were skewed count data?
It depends what values you want to predict for. The data frame
typically includes "typical" values of the parameters; you're welcome
to compute predictions for the medians instead if you prefer, or
for a range of values.
How do I use MCMCglmm correctly?
?? Do you mean the 'mcmc' argument of glmmadmb?
Is "se.fit" valid for the nbinom log link, or....?
[Update: I just wrote a bunch of stuff about how se.fit doesn't
work for glmmadmb, when it indeed does! However, I had already
written all the stuff below, which describes more generally how
you would do it by hand, so I'm just going to send it anyway]
The basic recipe for constructing confidence intervals, as laid out
in http://glmm.wikidot.com/faq in the "Predictions and/or confidence
(or prediction) intervals on predictions" section, is
(1) construct the model matrix (X)
* If you want predictions on the original model (not this case),
the model.matrix() accessor may work to extract it from the fit for you
* If not, or if you want to construct predictions for new data,
you need to call model.matrix() with the fixed-effect formula (you
may or may not be able to extract the fixed-effect part of the
formula only from the model, but it's usually fairly easy just
to respecify it
(2) extract the variance-covariance matrix of the fixed effect
predictors, V (usually just vcov(model))
(3) extract the fixed-effect coefficients (usually just fixef(model)
or coef(model))
then the predictions (on the linear predictor scale, at the population
level [i.e. ignoring all random effects]) are
X %*% beta
(plus an offset if there is one)
and the standard errors of prediction, *CONDITIONAL ON THE
RANDOM EFFECTS* [i.e. ignoring all uncertainty due to uncertainty
of the random effects] are
sqrt(diag(X %*% V %*% t(X)))
Then to get confidence intervals you should compute
inverse_link(fit +/- 1.96*se.fit) exactly as you have below
This assumes further that the sampling distribution of the
coefficients is Normal (in which case the linear computation
we did above will also lead to a Normal distribution, on
the linear predictor scale). This might not be true for
small or badly behaved data sets (of course it is never
exactly true except asymptotically ...), but there's not
much you can do about it without working much harder.