Dear Ben and Pierre,
Thank you to you both for the new solution suggestions and inputs. These
help a lot and have improved my model a lot! I tried to apply your newest
suggestions to my model.
I wanted to double-check if I understand the prior suggestion correctly.
While I do understand the basics of the priors to some extent, I still have
difficulties to look at a model and say which prior would be appropriate. I
thought that the probability density distribution of the prior is flatter
the smaller nu is, i.e. the smaller nu, the weaker the prior. By
recommending to make the prior stronger, do you mean to increase nu? If so,
I have nu=1000 in the G part of the prior formula, which I thought was a
very strong prior already? Or did you mean to increase nu in the R-part of
the prior specification?
I ran the model with the full data set, i.e. none of the animals have been
removed, resulting in 424 observations across 28 individuals.
I read somewhere that the ratio of (NITT-BURN)/THIN should be kept between
1000-2000. So, my question is, if as long as this ratio is between
1000-2000 is it ok to run the model for very long ? or is that trying too
hard to fit the model?
I ran a model with NITT = 920000, BURN = 20000, THIN = 600, i.e.
(NITT-BURN)/THIN = 1500. I attached the details below and added you the
trace and density plots (I removed them for the list email as it would make
the message too large). The ZI-related variance in the baitIntake is
slightly larger again (12 compared to 9 in the previous model where I
excluded some animals). The effective sample sizes are larger than 200 for
all ZI-related variances but still rather small for zi_baitIntake.id (412)
and zi_baitIntake.event (303). The autocorrelation is also rather large for
some of the ZI-related variances.
a) Formula, chain parameters and prior
priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
G = list(
G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))
,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0), alpha.V
= diag(2))))
NITT <- 920000
BURN <- 20000
THIN <- 600
MULT <- 10
model6.0 <- MCMCglmm(bait ~ trait - 1
+ trait:animals + trait:tourists + trait:operators
+ at.level(trait,1):presence
, random = ~idh(trait):id + idh(trait):event
, rcov =~ idh(trait):units
, data=df, family="zipoisson", prior=priorP2
, nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT
, verbose=FALSE)
b) Summary
baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:animals
baitIntake:tourists zi_baitIntake:tourists
Lag 0 1.00000000 1.000000000 1.000000000
1.000000000 1.000000000 1.00000000
Lag 6000 -0.03673458 0.205319459 0.011391819
0.461760213 0.011959233 0.41034495
Lag 30000 0.00868412 0.009355555 -0.004042761
-0.015145382 -0.006931932 0.07049628
Lag 60000 -0.03167273 0.051729288 0.003874329
0.031838052 -0.017647052 0.02593432
Lag 3e+05 0.02904094 0.034563568 0.011200624
-0.005617972 0.029031181 0.01240007
baitIntake:operators zi_baitIntake:operators presence
Lag 0 1.0000000000 1.0000000000 1.000000000
Lag 6000 -0.0403305740 0.3661869833 -0.008982195
Lag 30000 -0.0583269879 -0.0004637806 -0.011342973
Lag 60000 -0.0075806959 0.0026802367 0.019267725
Lag 3e+05 -0.0009343694 0.0035771025 0.065613628
autocorr.diag(model6.0$VCV)
baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event baitIntake.units zi_baitIntake.units
Lag 0 1.00000000 1.00000000 1.000000000
1.00000000 1.000000000 NaN
Lag 6000 0.01695959 0.19236401 -0.002904061
0.61268299 0.001951758 NaN
Lag 30000 0.02745489 0.11551187 -0.004205527
0.16867050 -0.018719149 NaN
Lag 60000 -0.01129109 0.09528737 0.009901908
0.05404658 0.004567198 NaN
Lag 3e+05 0.03131679 0.03816832 -0.031074398
0.02954608 0.021244502 NaN
However, I also read some more publications this morning that used MCMCglmm
and looked at their nitt, burn and thin parameters. A lot of them result in
(NITT-BURN)/THIN = 1000. I therefore re-run my model with parameters that
also had 1000 as a ratio.
The model showed some increase again in the variance (it is at 12 now for
the ZI-related variance). I think the ID does explain the 0?s a lot. Nearly
half of the animals never ate and within the animals that ate at least
once, values from 0.16 to 16.16 were observed. The autocorrelation
diagnostics showed less autocorrelation than in the model yesterday and in
the model above. The smallest effective sample size was 496. I added you
all the details below.
a) Model formula, prior and chain parameters
priorP2 <- list(R = list(V = diag(2), nu = 2, fix = 2),
G = list(
G1 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))
,G2 = list(V = diag(2), nu = 1000, alpha.mu = c(0,0),
alpha.V = diag(2))))
NITT <- 920000
BURN <- 20000
THIN <- 900
MULT <- 10
model7.0 <- MCMCglmm(bait ~ trait - 1
+ trait:animals + trait:tourists + trait:operators +
at.level(trait,1):presence
, random = ~idh(trait):id + idh(trait):event
, rcov =~ idh(trait):units
, data=df, family="zipoisson", prior=priorP2
, nitt=NITT*MULT, burnin=BURN*MULT, thin = THIN*MULT,
verbose=FALSE)
b) Model summary
baitIntake zi_baitIntake baitIntake:animals zi_baitIntake:
animals
Lag 0 1.000000000 1.00000000 1.000000000
1.00000000
Lag 9000 0.017938814 0.02981790 0.006431397
0.10481115
Lag 45000 -0.006022882 -0.03825681 -0.009933080
0.07381003
Lag 90000 0.005797923 -0.02302556 0.008195491
0.01082001
Lag 450000 -0.011994835 -0.06558056 -0.060988406
-0.04143047
baitIntake:tourists zi_baitIntake:tourists baitIntake:operators
Lag 0 1.000000000 1.00000000 1.0000000000
Lag 9000 0.013111341 0.12011385 0.0017622903
Lag 45000 -0.006995284 0.02981600 -0.0484513234
Lag 90000 -0.015289329 0.01696414 0.0001681332
Lag 450000 -0.064725489 0.01717594 0.0335091484
zi_baitIntake:operators presence
Lag 0 1.000000000 1.000000000
Lag 9000 0.164230628 -0.007679376
Lag 45000 -0.008430448 -0.020076515
Lag 90000 0.006900147 -0.020648916
Lag 450000 -0.016715936 -0.011170416
autocorr.diag(model7.0$VCV)
baitIntake.id zi_baitIntake.id baitIntake.event
zi_baitIntake.event
Lag 0 1.00000000 1.00000000 1.00000000 1.0000000000
Lag 9000 -0.02161770 0.13751550 0.01040345 0.3762189835
Lag 45000 0.01833775 -0.02772668 -0.04564059 0.0273746776
Lag 90000 0.02075539 0.01042728 0.02602572 0.0171497956
Lag 450000 -0.02695780 -0.04212894 -0.02191762 0.0009894756
baitIntake.units zi_baitIntake.units
Lag 0 1.000000000 NaN
Lag 9000 0.011209318 NaN
Lag 45000 -0.013619054 NaN
Lag 90000 0.030874665 NaN
Lag 450000 0.002752061 NaN
However, when I compared the plots to the plots from model 6.0 where I used
(NITT-BURN)/THIN = 1500, the plots of model 6.0 show better mixing. Do you
think that one of these two models would be more suitable than the other to
continue working with?
On a different note: the prior question at the beginning of the message
might not be appropriate for the list as I realize it is a rather basic
question. I tried to find some more resources online to get a better
understanding about choosing and defining appropriate priors in the future.
I found some webpages and forum entries but wanted to ask if you have
resources you would recommend in regards to better understanding priors and
prior choice?
Again, thank you very much for all the help! I appreciate it a lot.
Kind regards,
Vital
--
*Vital Heim *
*PhD student*
University of Basel, Switzerland
Bimini Biological Field Station Foundation
Bimini, Bahamas
+41 (0)79 732 05 57
vital.heim at gmail.com
I wanted to double-check if I understand the prior suggestion correctly.
While I do understand the basics of the priors to some extent, I still have
difficulties to look at a model and say which prior would be appropriate. I
thought that the probability density distribution of the prior is flatter
the smaller nu is, i.e. the smaller nu, the weaker the prior. By
recommending to make the prior stronger, do you mean to increase nu? If so,
I have nu=1000 in the G part of the prior formula, which I thought was a
very strong prior already? Or did you mean to increase nu in the R-part of
the prior specification?
I think the prior is OK as it is.
I read somewhere that the ratio of (NITT-BURN)/THIN should be kept between
1000-2000. So, my question is, if as long as this ratio is between
1000-2000 is it ok to run the model for very long ? or is that trying too
hard to fit the model?
That's not the best kind of advice in my opinion. The number of iterations is a bit meaningless without accounting for auto-correlation, which is why effective sample size was invented. Simply decide on a minimal target for effective sample size, depending on the kind of precision you would like for your estimates.
I ran a model with NITT = 920000, BURN = 20000, THIN = 600, i.e.
(NITT-BURN)/THIN = 1500. I attached the details below and added you the
trace and density plots (I removed them for the list email as it would make
the message too large). The ZI-related variance in the baitIntake is
slightly larger again (12 compared to 9 in the previous model where I
excluded some animals). The effective sample sizes are larger than 200 for
all ZI-related variances but still rather small for zi_baitIntake.id (412)
and zi_baitIntake.event (303). The autocorrelation is also rather large for
some of the ZI-related variances.
If you are still unsatisfied by the effective sample size, you simply need to run the model for longer.
However, I also read some more publications this morning that used MCMCglmm
and looked at their nitt, burn and thin parameters. A lot of them result in
(NITT-BURN)/THIN = 1000. I therefore re-run my model with parameters that
also had 1000 as a ratio.
Again, this is somewhat meaningless as it depends on your autocorrelation level, which in turns depends on both the model and the data. If anything, I believe you should save much more than 1000 iterations if you want to increase your effective sample size without having to run your MCMC for eons. The way you set up everything with this MULT variable might not be the best way forward.
On a different note: the prior question at the beginning of the message
might not be appropriate for the list as I realize it is a rather basic
question. I tried to find some more resources online to get a better
understanding about choosing and defining appropriate priors in the future.
I found some webpages and forum entries but wanted to ask if you have
resources you would recommend in regards to better understanding priors and
prior choice?