Bootstrapping glmer random effects
Hi Joe,

I am not sure I can give you good advice on this, but I can try.

What I noticed about your outputs is that the number of observations is the same for the analysis of the original data and for the last bootstrap replicate, while the number of cities differs only by 1. This suggests that you used individual observations as the resampling unit. However, as another responder has already mentioned, you should use cities as the resampling unit, because observations within each city are not independent.

When using city as the resampling unit, you need to include all observations from a city whenever that city is picked. You also need to rename the cities after picking them. For example, if city1 is picked and has, say, 4 observations, you bring all 4 observations into your bootstrap data and relabel the city as city1_1. If city1 is picked again, bring all 4 observations into your bootstrap data again and relabel the city as city1_2 for these 4 observations. The reason is that city is the grouping factor for the random effect, and you want to end up with the same number of distinct cities in your bootstrap data as in your original data (218, I believe).

I hope this will work. If you have questions about coding this up, I might be able to help.

Cheers,
Cornelia

Cornelia Oedekoven
CREEM
University of St Andrews
cornelia at mcs.st-and.ac.uk
www.creem.st-and.ac.uk

Quoting Joe King <joeking1809 at yahoo.com>:
Dear all
I am attempting to obtain a bootstrap confidence interval for the random
effect in a simple (random intercept) model using glmer.
The problem I have is that the interval I obtain consistently does not
contain the value I am trying to get an interval for! For example, I get the
following output when I run glmer on the full data:
Generalized linear mixed model fit by the Laplace approximation
Formula: wg~ (1 | city)
   Data: dt
   AIC   BIC logLik deviance
 10115 10131  -5056    10111
Random effects:
 Groups Name        Variance Std.Dev.
 city   (Intercept) 0.14155  0.37623
Number of obs: 19318, groups: city, 218
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.58566    0.04045  -63.93   <2e-16 ***
So I am trying to obtain a confidence interval for the random-effect variance,
0.14155. Yet the confidence interval I got was (0.2839343, 0.3534999).
Moreover, the value in every one of the bootstrap replicates is greater than
0.14155. For example, the glmer output from the last bootstrap replicate
was:
Generalized linear mixed model fit by the Laplace approximation
Formula: wg~ (1 | city)
   Data: sam
   AIC   BIC logLik deviance
 10480 10496  -5238    10476
Random effects:
 Groups Name        Variance Std.Dev.
 city   (Intercept) 0.32769  0.57245
Number of obs: 19318, groups: city, 217
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.58779    0.05142  -50.33   <2e-16 ***
There are no missing data. This is the code I have used to obtain the
interval:
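library(lme4)
bs <- numeric(k)  # k = number of bootstrap replicates
for (i in 1:k) {
    # resamples individual observations (rows of dt) with replacement
    sam <- dt[sample(nrow(dt), replace = TRUE, size = nrow(dt)), ]
    m1 <- glmer(wg ~ (1 | city), data = sam, family = binomial)
    bs[i] <- VarCorr(m1)$city[1]
}
quantile(bs, c(0.025, 0.975))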
Could anyone suggest why this is happening, and what I might be able to do
about it?
Thank you
JK
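
The city-level resampling described in the reply above could be sketched roughly as follows. This is only an illustration under assumptions, not code from either poster: the helper names resample_clusters and boot_city_variance are made up for this sketch, the toy data frame is invented, and each draw is relabelled with its draw index (city1_3 for the third draw, say) rather than a per-city counter, which gives unique labels all the same. The bootstrap loop assumes the lme4 package is installed and that the data have the columns wg and city as in the original post.

```r
# Draw whole cities with replacement; every draw gets its own label so that
# a city picked twice counts as two distinct groups in the random effect.
resample_clusters <- function(data, cluster = "city") {
  ids     <- unique(as.character(data[[cluster]]))
  by_city <- split(data, as.character(data[[cluster]]))
  picks   <- sample(ids, size = length(ids), replace = TRUE)
  pieces  <- lapply(seq_along(picks), function(j) {
    block <- by_city[[picks[j]]]                   # all rows of the picked city
    block[[cluster]] <- paste0(picks[j], "_", j)   # unique label per draw
    block
  })
  out <- do.call(rbind, pieces)
  out[[cluster]] <- factor(out[[cluster]])
  rownames(out) <- NULL
  out
}

# Percentile interval for the city-level variance (hypothetical helper;
# needs lme4, and k model fits can take a while on large data).
boot_city_variance <- function(data, k) {
  bs <- numeric(k)
  for (i in seq_len(k)) {
    sam   <- resample_clusters(data, "city")
    m1    <- lme4::glmer(wg ~ (1 | city), data = sam, family = binomial)
    bs[i] <- lme4::VarCorr(m1)$city[1]
  }
  quantile(bs, c(0.025, 0.975))
}

# Toy check on made-up data: three cities with 4, 2 and 3 observations.
set.seed(1)
toy  <- data.frame(city = rep(c("a", "b", "c"), times = c(4, 2, 3)),
                   wg   = rbinom(9, 1, 0.3))
boot <- resample_clusters(toy)
nlevels(boot$city)  # same number of distinct cities as in toy
```

With this scheme the number of groups stays fixed at the original number of cities, while the number of observations will generally differ from replicate to replicate because cities have different sizes. On the real data the call would be something like boot_city_variance(dt, k) with k chosen by the user.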
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models