ZIP MCMCglmm - how to increase effective sample size?

13 messages · Pierre de Villemereuil, Jarrod Hadfield, dani

#
Hi,

There are roughly three ways to increase the effective sample size:
- increase the number of iterations
- use a prior with better properties
- change your model somehow (you might not always want to use that one...)
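To see why the first option alone is often not enough, here is a base-R sketch (no packages; the AR(1) chain is a hypothetical stand-in for an MCMC trace) of how autocorrelation shrinks the effective sample size, roughly N / (1 + 2 * sum of autocorrelations):

```r
# Simulate a highly autocorrelated AR(1) chain as a stand-in for an MCMC trace.
set.seed(1)
N   <- 10000
rho <- 0.9
x <- as.numeric(stats::filter(rnorm(N), rho, method = "recursive"))

# Effective sample size estimate: N / (1 + 2 * sum of positive autocorrelations).
ac  <- acf(x, lag.max = 100, plot = FALSE)$acf[-1]  # drop lag 0
ess <- N / (1 + 2 * sum(ac[ac > 0]))
# With rho = 0.9, the ESS is near N * (1 - rho) / (1 + rho), only a few percent of N,
# which is why better mixing (options 2 and 3) can beat simply running longer.
```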

In your case, a slightly more informative, parameter-expanded prior might help. Something like:

priori <- list(R=list(V=diag(2), nu=1, fix=2),
               G=list(G1=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)),
                      G2=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000))))

Hope this helps,
Pierre.
On Wednesday, 25 October 2017 18:45:46 NZDT dani wrote:
#
Hello Pierre,


Thank you so much for your message! I truly appreciate your help with the prior, I have exhausted all the other options and I was going to give up using MCMCglmm. I will give it a try and hopefully it works out!


Cheers,

Dani
#
Hi Pierre,

I tried using the new prior you suggested and I got this error:

Error in MCMCglmm(y ~ trait - 1 + at.level(trait,1):(x1+:
  prior list should contain elements R, G, and/or B only

I am not sure what to do about this:)
Any advice would be very much appreciated.

Thanks,
DaniNM
#
Just a parenthesis issue:
priori <- list(R=list(V=diag(2), nu=1, fix=2),
               G=list(G1=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000),
                      G2=list(V=diag(2)/2, nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000)))
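To confirm the corrected structure parses, a quick base-R check mirroring the error message above (the prior list may only contain top-level elements named R, G, and/or B):

```r
# Corrected prior from above; building it without error confirms the parentheses balance.
priori <- list(R = list(V = diag(2), nu = 1, fix = 2),
               G = list(G1 = list(V = diag(2)/2, nu = 2,
                                  alpha.mu = c(0, 0), alpha.V = diag(2) * 1000),
                        G2 = list(V = diag(2)/2, nu = 2,
                                  alpha.mu = c(0, 0), alpha.V = diag(2) * 1000)))

# MCMCglmm's "prior list should contain elements R, G, and/or B only" complaint
# fires when other top-level names sneak in (e.g. a G2 that escaped the G list).
ok <- all(names(priori) %in% c("R", "G", "B"))
```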

Cheers,
Pierre

On Wednesday, 1 November 2017 17:18:05 NZDT dani wrote:
#
Hello Pierre and list members,


Thank you so much! The analysis with the new prior worked :) However, the effective sample sizes are still small, so I am now trying the new prior with more iterations - I will report back on how the effective sample sizes change.


Best regards, everyone!

DNM
#
You can look at the auto-correlation to guess how many more iterations are needed to increase your sample size. Something like:

library(coda)
autocorr.diag(mj$Sol)
autocorr.diag(mj$VCV)
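As a rule of thumb, the lag at which the autocorrelation falls below about 0.1 suggests a thinning interval, and nitt should then be at least thin times the desired effective sample size, plus burn-in. A base-R sketch (`x` below is a simulated stand-in for one column of mj$Sol or mj$VCV):

```r
# Simulated stand-in for the chain of a single model parameter.
set.seed(2)
N <- 20000
x <- as.numeric(stats::filter(rnorm(N), 0.8, method = "recursive"))

# First lag at which autocorrelation drops below 0.1: a candidate thinning interval.
ac   <- acf(x, lag.max = 50, plot = FALSE)$acf[-1]
thin <- which(ac < 0.1)[1]
```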

Cheers,
Pierre
On Wednesday, 1 November 2017 20:14:15 NZDT dani wrote:
#
Thanks, Pierre! Will do that, too!

Best,
DNM
#
Hi,

I would try and diagnose why the mixing is bad first.

1) How many observations are there, what is their mean, and how many are zero?
2) How many levels are there to group1 and group2?
3) What is the range of the latent variable for the zero-inflation? Include pl=TRUE in the call to MCMCglmm and plot a histogram of

model$Liab[, (ncol(model$Liab)/2 + 1):ncol(model$Liab)]

4) Why do you have random effects for the zero-inflation but choose to ignore it in the fixed effects?

Cheers,

Jarrod
On 01/11/2017 20:31, dani wrote:
#
Hi Jarrod,


Thank you very much for your message!

I think one of the main issues is the high degree of autocorrelation. I tried to check for autocorrelation as Pierre suggested, but my computer ran out of memory.

Here are my answers to your questions:

1) There are 10523 observations with a mean of 0.09; about 95% of the observations are zero.

value    N     raw %   valid %   cumulative %
0        9946  94.52   94.52     94.52
1        372   3.54    3.54      98.05
2        112   1.06    1.06      99.12
3        58    0.55    0.55      99.67
4        22    0.21    0.21      99.88
5        7     0.07    0.07      99.94
6        4     0.04    0.04      99.98
7        1     0.01    0.01      99.99
8        1     0.01    0.01      100.00
missing  0     0.00

total N = 10523 · valid N = 10523 · x̄ = 0.09 · σ = 0.44


2) group 1: 1768 levels

   group 2: 1857 levels


3) I attached the histogram of the latent variable.


4) I would like a model with two random intercepts (group 1 and 2) and I am not sure how to do that with a zero inflation term.


Thank you so much for taking the time to help me with this.

Best regards,

DNM
#
Hi Dani,


It's hard to tell without seeing the data, but my guess is that the poor mixing is due to there being no zero-inflation (and so the zi latent variables are heading to -infinity). exp(-0.09) = 0.91 gives the probability of a zero from a standard Poisson, which is pretty close to your observed 95%. Also, the variance is about double the mean, suggesting over-dispersion. Add an over-dispersion parameter to the Poisson process (which MCMCglmm does automatically) and you would have even more zeros, quite possibly close to 95%. As I said, it's hard to know without seeing the data. You could run a standard Poisson model and use posterior predictive tests to obtain a) the number of expected zeros and b) the 95% upper quantile of the observations, and see if they are compatible with your data.


You can now do posterior predictive checks more easily in MCMCglmm:


sim <- simulate(poisson_model, nsim=1000)   # 1000 posterior predictive datasets

nzeros <- apply(sim, 2, function(x){sum(x == 0)})        # zeros per simulated dataset
uq     <- apply(sim, 2, function(x){quantile(x, 0.95)})  # 95% quantile per dataset

hist(nzeros)
abline(v=sum(s25h$y == 0))   # observed number of zeros


If the statistics calculated from the data are in the body of the histograms, then an over-dispersed Poisson is probably accurate.
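The same posterior-predictive logic can be rehearsed without a fitted model. A self-contained base-R sketch (the lognormal-Poisson below is an assumed stand-in for MCMCglmm's over-dispersed Poisson, with the marginal mean matched to the thread's 0.09):

```r
# "Observed" data from a lognormal-Poisson (over-dispersed Poisson) process.
set.seed(3)
n  <- 10523
mu <- log(0.09) - 0.5                           # latent mean so that E[y] is ~0.09
rlnpois <- function(n) rpois(n, exp(mu + rnorm(n)))

y   <- rlnpois(n)                               # stand-in for the real data
sim <- replicate(1000, rlnpois(n))              # stand-in for posterior predictive draws

# Jarrod's two statistics, computed per simulated dataset.
nzeros <- apply(sim, 2, function(x) sum(x == 0))
uq     <- apply(sim, 2, function(x) quantile(x, 0.95))

# Compare the observed zero count to the simulated distribution.
hist(nzeros)
abline(v = sum(y == 0))
```

Because the "observed" data here come from the same process as the simulations, the observed zero count lands in the body of the histogram; with real data, a count far in the tail would indicate the model misses the zeros.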


Cheers,


Jarrod
On 02/11/2017 15:28, dani wrote:
#
Hi Jarrod,


Thank you so much for your detailed and extremely helpful message!


I tried to post the message below plus some images showing the histograms, but the message was too large, so I am only sending the text here. I will send the images separately.


_______________________________________________________________


Yes, I was a bit puzzled about zero-inflation and overdispersion, because I ran a GLMER Poisson and a GLMER negative binomial, as well as two glmmTMB models (Poisson and negative binomial), before getting a little more familiar with MCMCglmm.


I ran the glmmTMB models for both cases: with and without zero inflation. The tests for zero-inflation and overdispersion indicated that the data presented both, yet the AICs were better for the models without zero inflation. I attached the histograms of the expected zeros for the two GLMER models. I also ran all the null models, including null models with only one random intercept or none, to be able to calculate ICCs and test the significance of the cross-classified random effects.


I attached a table showing the AICs and DICs for some of the models I ran. For the zero-inflation models, some of the null models (with either one of the two random effects) did not converge, so I believe the zero-inflation models perform worse than the models that do not account for zero inflation.


In the case of the MCMCglmm models, again, as you suggested, the results seem better for the models without zero-inflation. The effective sample sizes are ok-ish (around 200 or so). I attached the histogram I obtained using the script you suggested.


It seems that MCMCglmm does not accommodate negative binomial distributions, is that right? In that case, I would like to compare the three types of models (GLMER, glmmTMB, and MCMCglmm) for Poisson distributions, and it looks like I should only include the models without zero inflation. I am not sure, though, whether I need to further account for overdispersion in the GLMER or glmmTMB models.

Thank you so much for all your help with this!
Greetings to all list members!
Best,
DNM
#
Hello again,


Please find attached the images for my previous message.

Thanks,

DNM
#
Hello again,


Sorry for sending so many messages; they could not be sent because of their size. I have re-attached the images.

DNM