Skip to content

zero-inflated models in MCMCglmm

4 messages · Jarrod Hadfield, Gustaf Granath

#
Hi,

Thanks for looking at this Jarrod. I have included R code below and data 
is loaded from github.

I realized that I had missed one thing though. I must have suppressed 
warnings, because I now see that my zip-model gives a warning about not 
being able to estimate all fixed effects (thus removed). I get no 
warnings for poisson and zap models. The warning is telling my that 
there is a reason why the contrasts are messed up, but I havent been 
able to pinpoint whats going on though (I have checked that the 
experimental design is correct etc). This is likely very case specific, 
but I'd appreciate any qualified guess that can help me.

Cheers

Gustaf

#################
# test data
require(RCurl)
zero.dat 
<-read.csv(text=getURL("https://raw.githubusercontent.com/ggranath/zero_inf_models/master/test_data.csv"),header=T)
#zero.dat <- read.csv("test_data.csv")

#plots nested in site (X3 treatments at plot-level)
zero.dat$nested_plot <- factor(with(zero.dat, paste(site, plot, sep= 
"_")) )

#zip model
nitt = 2000 #low for testing
thin = 10 #low for testing
prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
zip <- MCMCglmm(y ~ trait -1 + at.level(trait, 1):(X1*X2*X3_nest),
                 random = ~ nested_plot, rcov = ~idh(trait):units,
                 data=zero.dat, family = "zipoisson",  nitt = nitt,
                 burnin = 100, thin=thin, prior=prior, pr = TRUE, pl = TRUE)
summary(zip)
# Gives a warning!! And coef table seems messed up.
# Something is going on here.....dont know what though.

#zap model
zap <- MCMCglmm(y ~ trait*(X1*X2*X3_nest),
                     random = ~ nested_plot, rcov = ~idh(trait):units,
                     data=zero.dat, family = "zapoisson",  nitt = nitt,
                     burnin = 100, thin=thin,  prior=prior, pr = TRUE, 
pl = TRUE)
summary(zap)
# No warning and everything looks good

# Poisson model
prior = list( R = list(V = diag(1), nu = 0.002),
               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
pois <- MCMCglmm(y ~ X1*X2*X3_nest,
                 random = ~ nested_plot,
                 data=zero.dat, family = "poisson",  nitt = nitt,
                 burnin = 100, thin=thin,  prior=prior, pr = TRUE, pl = 
TRUE)
summary(pois)
# No warning

# Some marginal predictions
p.zip <- predict(zip, marginal = ~ nested_plot,  type="response", 
posterior = "mean")
p.zap <- predict(zap, marginal = ~ nested_plot,  type="response", 
posterior = "mean")
p.pois <- predict(pois, marginal = ~ nested_plot, type="response", 
posterior = "mean")
cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
       zip = aggregate(p.zip ~ fire*retention*micro_hab.two, dat, mean)$V1,
       zap = aggregate(p.zap ~ fire*retention*micro_hab.two, dat, mean)$V1,
       pois = aggregate(p.pois ~ fire*retention*micro_hab.two, dat, 
mean)$V1)
# poisson model gives extreme (unrealistic) values.
On 2016-11-30 21:21, Jarrod Hadfield wrote:
#
Hi Gustaf,

I don't have dat, so I can't run the final bit of code.

However, these are my thoughts.

1/ In the zap model you are allowing X1 to X3 to effect the level of 
zero alteration, whereas in the zip model you are just fitting a 
constant level  of zero inflation across X1 to X3. In this sense the zap 
model will provide a superior fit because it has more parameters. The 
warning message about dropping terms is fine, although the default 
contrast for the zip model is a bit annoying: I would have preferred the 
main effect for X1 to be yes rather than no, but I guess its no big deal.

2/ You have fitted a single nested_plot effect in the random effects. 
This is a little odd, except in the case where the data are not 
zero-inflated. In this case having a single nested_plot term in the zap 
model is equivalent to fitting a nested_plot term in the standard 
Poisson. If the data are zero-inflated, and/or the model is not a zap 
model, then the case for having a single term is not well justified. I 
would use us or idh and in the latter case consider fixing the second 
variance (associated with zero-inflation/alteration) close to zero.

3/ The marginal predictions do not take into account the covariances 
between traits. This is generally OK, but its not ideal when the traits 
refer to the multiple parameters of a single distribution as with 
za/zi/hu models. I should put a warning in. You can also use the 
simulate function on the model and then average over the draws to get 
the predictions, and this will take into account any covariances. Note 
that if you use idh(trait):nested_plot there are no covariances so 
predict should be fine.

Cheers,

Jarrod
On 30/11/2016 23:30, Gustaf Granath wrote:

  
    
#
Jarrod,

Thanks for explaining this. I now start to understand how these models 
are set up in MCMCglmm. The weird contrasts seem impossible to get rid 
of though, and Im still struggling a bit to work out how to combine the 
effects.

Great idea to use the simulation function. I used it for predictive 
model checking of zeros, and I do have one question. Does simulate() 
include (marginalise) the residual error (units)? I get quite different 
results if I use a "by hand" simulation function (from course notes) and 
simulate(), for a standard Poisson model.

Cheers

Gustaf

####### R code

# get test data
require(MCMCglmm)
require(RCurl)
zero.dat 
<-read.csv(text=getURL("https://raw.githubusercontent.com/ggranath/zero_inf_models/master/test_data.csv"),header=T)

#plots nested in site (X3 treatments at plot-level)
zero.dat$nested_plot <- factor(with(zero.dat, paste(site, plot, sep= 
"_")) )

#zip model
nitt = 10000 #low for testing
thin = 10 #low for testing
burnin = 1000 #low for testing
prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
               G=list(G1=list(V=diag(2)*c(1,0.001), nu=0.002, fix=2)))
zip <- MCMCglmm(y ~ trait -1 + at.level(trait, 1):(X1*X2*X3_nest),
                 random = ~ idh(trait):nested_plot, rcov = 
~idh(trait):units,
                 data=zero.dat, family = "zipoisson",  nitt = nitt,
                 burnin = burnin, thin=thin, prior=prior, pr = TRUE, pl 
= TRUE)
summary(zip)
# Gives a warning!! And contrasts are not straight forward to interpret.

#zap model
prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
zap <- MCMCglmm(y ~ trait*(X1*X2*X3_nest),
                     random = ~ nested_plot, rcov = ~idh(trait):units,
                     data=zero.dat, family = "zapoisson",  nitt = nitt,
                     burnin = burnin, thin=thin,  prior=prior, pr = 
TRUE, pl = TRUE)
summary(zap)
# No warning and everything looks good

# Poisson model
prior = list( R = list(V = diag(1), nu = 0.002),
               G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
pois <- MCMCglmm(y ~ X1*X2*X3_nest,
                 random = ~ nested_plot,
                 data=zero.dat, family = "poisson",  nitt = nitt,
                 burnin = burnin, thin=thin,  prior=prior, pr = TRUE, pl 
= TRUE)
summary(pois)
# No warning

# Some predictions
# by default, all random factors are marginalised
p.zip <- predict(zip,   type="response", posterior = "mean")
p.zap <- predict(zap,  type="response", posterior = "mean")
p.pois <- predict(pois,   type="response", posterior = "mean")
cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
       zip = aggregate(p.zip ~ X1*X2*X3_nest, zero.dat, mean)$V1,
       zap = aggregate(p.zap ~ X1*X2*X3_nest, zero.dat, mean)$V1,
       pois = aggregate(p.pois ~ X1*X2*X3_nest, zero.dat, mean)$V1)
# poisson model give extreme (unrealistic) values. zap best I think.


# Predictive testing: zeros

# zip model
oz <- sum(zero.dat == 0)
sim.zi <- simulate(zip, type="response", posterior = "all", nsim=1000)
dist.zeros.zi <- apply(sim.zi, 2, function (x) sum(x==0))
hist(dist.zeros.zi)
abline(v = oz, col = "red")

# zap model
oz <- sum(zero.dat == 0)
sim.za <- simulate(zap, type="response", posterior = "mean", nsim=1000)
dist.zeros.za <- apply(sim.za, 2, function (x) sum(x==0))
hist(dist.zeros.za)
abline(v = oz, col = "red")
# very good prediction of zeros!

# Poisson model
oz <- sum(zero.dat == 0)
sim.pois <- simulate(pois, type="response", posterior = "mean", nsim=1000)
dist.zeros.pois <- apply(sim.pois, 2, function (x) sum(x==0))
hist(dist.zeros.pois, xlim=c(900,1250))
abline(v = oz, col = "red")
# Zero-inflated!

# Simulation from pois model, by hand.
mc.samp <- nrow(pois$VCV)
nz <- 1:mc.samp
oz <- sum(zero.dat == 0)
for (i in 1:mc.samp) {
   pred.l <- rnorm(nrow(zero.dat), (cbind(pois$X,pois$Z) %*% 
pois$Sol[1,])@x, sqrt(pois$VCV[i,]))
   nz[i] <- sum(rpois(nrow(zero.dat), exp(pred.l)) == 0)
}
hist(nz, xlim=c(900,1250))
abline(v = oz, col = "red")


# plot zero distributions
zeroDens <- data.frame(y = c(dist.zeros.zi, dist.zeros.za, 
dist.zeros.pois, sample(nz,1000)),
                        model = rep(c("zi", "za", "pois", 
"pois.byHand"), each=1000))
library(ggplot2)
ggplot(zeroDens,aes(x=y, fill=model)) + geom_density(alpha=0.25) + 
geom_vline(xintercept=oz)
# only ZAP that captures all the zeros. Zip not that bad though.
On 2016-12-01 07:48, Jarrod Hadfield wrote:
#
Hi,

The automatic contrasts from model.matrix are sometimes a bit illogical: 
you can just multiply  at.level(trait, 1):X1no by -1 to get 
at.level(trait, 1):X1yes.

simulate is not simulating expectations but new observations. If ~ 
idh(trait):nested_plot is specified in the marginal argument new levels 
of nested_plot are simulated (a new one for each observation) if the 
marginal argument is NULL then the actual levels of nested_plot are used.

To get the expectations as in predict simulate 1000 (for example) draws 
and then take the rowMeans to get an average over the distribution of 
random and residual effects.

Cheers,

Jarrod
On 01/12/2016 22:36, Gustaf Granath wrote: