How to know if random intercepts and slopes are necessary for glmer.nb model
On 20/10/2015 11:32, David Jones wrote:
Dear Alain - Thank you for these suggestions. In response to your questions: the Poisson GLMM equivalents often run for these models (though I get the warning, "Model is nearly unidentifiable: very large eigenvalue"). For the models that do fit without problems, overdispersion tests do indicate overdispersion, and the negative binomial equivalents show better fit based on a chi-square comparison of -2LL.
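The kind of overdispersion check mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (30 hospitals, 40 patients each, negative binomial counts), the names d, hosp_eff and dispersion are made up for the example, and the Pearson dispersion statistic is one common check, not necessarily the test that was actually used.

```r
library(lme4)

set.seed(1)
# Simulated stand-in data (hypothetical, not the hospital data set)
n_hosp <- 30
n_per <- 40
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = n_per)),
  a2 = factor(sample(c("A", "B"), n_hosp * n_per, replace = TRUE))
)
hosp_eff <- rnorm(n_hosp, sd = 0.3)  # hospital-level random intercepts
mu <- exp(1 + 0.3 * (d$a2 == "B") + hosp_eff[as.integer(d$Hospital)])
d$y <- rnbinom(nrow(d), mu = mu, size = 1)  # deliberately overdispersed counts

# Poisson GLMM with a random intercept per hospital
pois_fit <- glmer(y ~ a2 + (1 | Hospital), family = poisson, data = d)

# Pearson dispersion statistic: sum of squared Pearson residuals divided by
# the residual degrees of freedom; values well above 1 suggest overdispersion
pearson_res <- residuals(pois_fit, type = "pearson")
dispersion <- sum(pearson_res^2) / df.residual(pois_fit)
print(dispersion)
```

A large value only says that the Poisson variance assumption fails; it does not say why, which is the point taken up in the reply.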
David, a better log-likelihood (or a significant chi-square test) is not by itself a reason to apply an NB GLM or NB GLMM. Overdispersion can have at least 10 different causes, each requiring a different solution, and you need to pinpoint what is driving it. If you pick the wrong cause, you may end up with the wrong model.
However, you state that the overdispersion is due to a few patients who stay in hospital for a long time. That would be an argument in favour of an NB GLMM, but equally of a Poisson GLMM with an observation-level random intercept. The latter is much faster to estimate. It is not my favourite model, but being pragmatic, it is perhaps the way forward. Setting theta to a fixed value in glmer.nb will certainly help.
Kind regards,
Alain
PS: Is length of stay in a hospital not strictly positive? Not that I want to suggest using a zero-truncated distribution for a data set with 500,000 observations... :-)
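The two pragmatic options named in this reply might look roughly like the sketch below. Everything in it is an assumption made for illustration: the data are simulated, the names d, obs, olre_fit, theta_fixed and nb_fixed are hypothetical, and theta is held fixed by passing it through MASS::negative.binomial() in a plain glmer() call, which is a swapped-in way of fixing it and not the narrow-interval route through glmer.nb.

```r
library(lme4)
library(MASS)

set.seed(2)
# Simulated stand-in data (hypothetical, not the hospital data set)
n_hosp <- 30
n_per <- 40
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = n_per)),
  a2 = factor(sample(c("A", "B"), n_hosp * n_per, replace = TRUE))
)
hosp_eff <- rnorm(n_hosp, sd = 0.3)
mu <- exp(1 + 0.3 * (d$a2 == "B") + hosp_eff[as.integer(d$Hospital)])
d$y <- rnbinom(nrow(d), mu = mu, size = 1)

# Option 1: Poisson GLMM with an observation-level random intercept,
# i.e. one factor level per row to absorb the extra-Poisson variation
d$obs <- factor(seq_len(nrow(d)))
olre_fit <- glmer(y ~ a2 + (1 | Hospital) + (1 | obs),
                  family = poisson, data = d)

# Option 2: NB GLMM with theta held fixed.
# Take theta from a nearby NB GLM without random effects ...
nb_glm <- glm.nb(y ~ a2, data = d)
theta_fixed <- nb_glm$theta
# ... and pass it as a constant, so only the fixed effects and the
# hospital variance are estimated
nb_fixed <- glmer(y ~ a2 + (1 | Hospital),
                  family = negative.binomial(theta = theta_fixed),
                  data = d)

print(VarCorr(olre_fit))
print(VarCorr(nb_fixed))
```

With theta fixed, the NB GLMM is no harder to fit than any other GLMM with a known family, which is why it tends to be faster and more stable than estimating theta jointly.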
The DV is length of stay in hospital, and the overdispersion is due to some patients who stay for a very long time. As for the number of hospitals, there are over 150.
On Tue, Oct 20, 2015 at 6:18 AM, Highland Statistics Ltd <highstat at highstat.com> wrote:
----------------------------------------------------------------------
Message: 1
Date: Mon, 19 Oct 2015 08:59:40 -0400
From: David Jones <david.tn.jones at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] How to know if random intercepts and slopes are necessary for glmer.nb model
I am receiving a number of different warnings/errors when running glmer.nb on a fairly large dataset (N>500,000). For some of the models I have run, program-reported errors prevent the generation of estimates. I suspect that it is because the random effects are very small. I have tried models with random intercepts, as well as models with both random intercepts and slopes (all models include fixed effects). I am running models on a dataset which in theory would include random effects (patients nested within hospitals).
My question is: how do you know if random intercepts and slopes are necessary, if you can't even estimate the random effects models (and thus use a model comparison test)? As far as I am aware, you can look at design effects to evaluate if a random intercept is necessary (though please correct me if I am wrong here).
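For reference, the design effect mentioned here is usually computed as 1 + (m - 1) * ICC, where m is the average cluster size and ICC is the intraclass correlation. A minimal sketch follows; the function name design_effect is made up for the example, the cluster size is only approximated from the figures quoted in this thread (about 500,000 patients in about 150 hospitals), and the ICC of 0.01 is a purely hypothetical value.

```r
# Design effect: deff = 1 + (m - 1) * ICC
# m   = average number of observations per cluster
# icc = intraclass correlation (share of variance between clusters)
design_effect <- function(avg_cluster_size, icc) {
  1 + (avg_cluster_size - 1) * icc
}

# Rough illustration: ~500,000 patients spread over ~150 hospitals
avg_size <- 500000 / 150

# Hypothetical ICC; even a small value is multiplied by a large cluster size
deff <- design_effect(avg_size, icc = 0.01)
print(deff)
```

A commonly quoted rule of thumb treats values above about 2 as a sign that the clustering should not be ignored. With clusters this large, even a very small ICC can push the design effect well past that, so a small hospital variance does not by itself make the random intercept unnecessary.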
Some example code I have used is below - many thanks.
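library(lme4)
# Store the factor in the data frame so glmer.nb finds it through `data`
analysis$a2 <- factor(analysis$Location)
# Random intercept per hospital
NBIntercept <- glmer.nb(y ~ a2 + (1 | Hospital), data = analysis)
# Random intercept and random a2 slopes per hospital. The code as posted,
# (1 | Hospital) + (1 + a2 | Hospital), specified the hospital intercept twice.
NBInterceptSlope <- glmer.nb(y ~ a2 + (1 + a2 | Hospital), data = analysis)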
David, this is a bit of a 'Gandalf' question. Perhaps you should first figure out why the NB GLMM does not run. How many hospitals do you have? Perhaps you can set the theta parameter in glmer.nb to a fixed value (use an interval with nearly the same lower and upper limit) and get the (log of) theta from a nearby NB GLM model. That would certainly make the estimation process easier!
Why are you doing an NB GLMM? Do the Poisson GLMM equivalents run? I assume you had overdispersion. What was driving the overdispersion?
And if computing time is slow for the second NB GLMM model, fit the first model and see whether there are any a2 effects per hospital in the residuals of the first model.
Alain
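The residual check suggested in the last paragraph could be sketched along these lines. The data are simulated with hospital-specific a2 effects built in, a Poisson GLMM stands in for the NB GLMM to keep the example fast, and the names d, fit1, res_tab and slope_signal are hypothetical.

```r
library(lme4)

set.seed(3)
# Simulated stand-in data (hypothetical) with hospital-specific a2 effects
n_hosp <- 30
n_per <- 40
d <- data.frame(
  Hospital = factor(rep(seq_len(n_hosp), each = n_per)),
  a2 = factor(sample(c("A", "B"), n_hosp * n_per, replace = TRUE))
)
h <- as.integer(d$Hospital)
is_b <- as.numeric(d$a2 == "B")
hosp_int <- rnorm(n_hosp, sd = 0.3)    # random intercepts
hosp_slope <- rnorm(n_hosp, sd = 0.4)  # random a2 slopes
d$y <- rpois(nrow(d), exp(1 + 0.3 * is_b + hosp_int[h] + hosp_slope[h] * is_b))

# First model: random intercept only
fit1 <- glmer(y ~ a2 + (1 | Hospital), family = poisson, data = d)

# Mean Pearson residual for each hospital and a2 level
d$res <- residuals(fit1, type = "pearson")
res_tab <- tapply(d$res, list(d$Hospital, d$a2), mean)

# If the a2 effect were the same in every hospital, these per-hospital
# differences would scatter tightly around zero
slope_signal <- res_tab[, "B"] - res_tab[, "A"]
print(summary(slope_signal))
```

A wide spread in slope_signal, or a clear pattern when the residuals are plotted by a2 within each hospital, hints that a random slope may be worth the extra computing time.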
--
Dr. Alain F. Zuur
First author of:
1. Beginner's Guide to GAMM with R (2014).
2. Beginner's Guide to GLM and GLMM with R (2013).
3. Beginner's Guide to GAM with R (2012).
4. Zero Inflated Models and GLMM with R (2012).
5. A Beginner's Guide to R (2009).
6. Mixed effects models and extensions in ecology with R (2009).
7. Analysing Ecological Data (2007).
Highland Statistics Ltd.
9 St Clair Wynd
UK - AB41 6DZ Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com