parallel MCMCglmm, RNGstreams, starting values & priors

Hi,

You need to make sure that they are fixed at the same value, not just fixed.

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Fri, 29 Aug 2014  
11:07:21 +0200:
Hi Jarrod,

that was exactly it. I hadn't checked the $VCV mcmclist, but I'll do  
so in the future
as the mistake is blindingly obvious that way: http://imgur.com/nlA0QwZ

After adding fix=2 to R1 in my starting values, my parallel chains  
converged as well.

For future amateurs reading this:
prior <- list(
	R=list(V=diag(2), nu=1.002, fix=2),
	G=list(G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000))
)

start <- list(
	liab=c(rnorm( nrow(krmh.1)*2 )),
	R = list(R1 = rIW(diag(2), 10 , fix = 2)), ### FIX=2 HERE AS WELL
	G = list(G1 = rIW(diag(2), 10 ))
)
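
The point about "fixed at the same value" can be made explicit with a small base-R sketch. make_start() is a hypothetical helper, not part of MCMCglmm, and the inverse-gamma draws are just one arbitrary way to get over-dispersed variances (the thread itself uses rIW()). The element is spelled Liab here, as in the earlier examples in this thread. The only thing the sketch is meant to show is that the non-identified za residual variance, R1[2,2], is the same constant in every chain and matches V[2,2] in the prior, while everything else varies.

```r
# Hypothetical helper (not part of MCMCglmm): over-dispersed starting values
# for a two-trait zapoisson model with idh(trait) residual and random structures.
make_start <- function(n_obs, fixed_za_var = 1) {
  list(
    Liab = rnorm(n_obs * 2),                                     # two liabilities per datum
    R = list(R1 = diag(c(1 / rgamma(1, 5, 5), fixed_za_var))),   # [2,2] is a constant
    G = list(G1 = diag(1 / rgamma(2, 5, 5)))                     # both G variances are free
  )
}

set.seed(1)
starts <- lapply(1:4, function(i) make_start(n_obs = 100))

# The za residual variance must be identical across chains ...
za_var <- sapply(starts, function(s) s$R$R1[2, 2])
# ... while the Poisson residual variance is free to differ.
free_var <- sapply(starts, function(s) s$R$R1[1, 1])
stopifnot(all(za_var == 1))
```

If the prior fixes the za variance at some other value (say 0.5), fixed_za_var would need to be set to that same value.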

Thank you so much. This sort of remote diagnosis with this little information
seems borderline psychic to me.
Now, onwards to modelling the pedigrees :-)

Best wishes,

Ruben

PS.: I never actually figured out what kind of variability to use  
for rIW() in my starting values,
but it doesn't seem to matter and that's what I should want I  
suppose. Actually, even with the mis-specified
starting values my posterior looked pretty similar to now.

On 29 Aug 2014, at 09:13, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:

Hi Ruben,

Actually I might know what it is. When you sample different  
starting values do you inadvertently sample a new residual variance  
for the unidentified za part? You need to make sure that this is  
always fixed at the same value (otherwise the model is different).  
This is not a problem under the trait:units specification.

Cheers,

Jarrod.

Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk> on Fri, 29 Aug 2014  
08:04:42 +0100:

Hi Ruben,

Can you share your data so that I can take a look? It's definitely not  
Monte Carlo error.

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug 2014  
22:44:47 +0200:

Hi Jarrod,

those two matched up quite well, yes. I just completed another 20  
chains, using more variable starting values. There are still two
fixed effects, traitza_children:spouses and traitza_children:male,
which haven't converged according to the multi-chain diagnostic
(gelman.diag) but have according to geweke.diag.
The offending traces: http://imgur.com/Qm6Ovfr
These specific effects aren't of interest to me, so if this  
doesn't affect the rest of my estimates, I can be happy
with this, but I can't conclude that, can I?
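
Why the two diagnostics can disagree, as a base-R sketch on simulated draws (not the data from this thread): each chain below is stationary on its own, which is all a within-chain check such as geweke.diag() looks at, but two of the chains sit at a different level, which only a between-chain comparison picks up. rhat() is a simplified hand-rolled potential scale reduction factor; it omits the degrees-of-freedom correction that coda's gelman.diag() applies, so the numbers will not match coda exactly.

```r
# Simplified potential scale reduction factor.
# chains: matrix with one column per chain and one row per iteration.
rhat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))      # mean within-chain variance
  B <- n * var(colMeans(chains))        # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(1)
same    <- matrix(rnorm(4000), ncol = 4)            # four chains, same target
shifted <- sweep(same, 2, c(0, 0, 2, 2), "+")       # two chains offset by 2

rhat_same    <- rhat(same)      # close to 1
rhat_shifted <- rhat(shifted)   # well above 1
```

In the thread, the offset between chains turned out to come from each chain being handed a different "fixed" residual variance, i.e. the chains were sampling different models.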

I'm now also doing a run to see how it deals with the more  
intensely zero-inflated data when including
the unmarried.

Thanks a lot for all that help,

Ruben

gelman.diag(mcmclist)
Potential scale reduction factors:

                                         Point est. Upper C.I.
(Intercept)                                      1.00       1.00
traitza_children                                 1.40       1.65
male                                             1.00       1.00
spouses                                          1.00       1.00
paternalage.mean                                 1.00       1.00
paternalage.factor(25,30]                        1.00       1.00
paternalage.factor(30,35]                        1.00       1.00
paternalage.factor(35,40]                        1.00       1.00
paternalage.factor(40,45]                        1.00       1.00
paternalage.factor(45,50]                        1.00       1.00
paternalage.factor(50,55]                        1.00       1.00
paternalage.factor(55,90]                        1.00       1.00
traitza_children:male                            1.33       1.54
traitza_children:spouses                         2.21       2.83
traitza_children:paternalage.mean                1.01       1.02
traitza_children:paternalage.factor(25,30]       1.05       1.08
traitza_children:paternalage.factor(30,35]       1.08       1.13
traitza_children:paternalage.factor(35,40]       1.15       1.25
traitza_children:paternalage.factor(40,45]       1.15       1.26
traitza_children:paternalage.factor(45,50]       1.26       1.43
traitza_children:paternalage.factor(50,55]       1.15       1.25
traitza_children:paternalage.factor(55,90]       1.14       1.23

Multivariate psrf

8.99

summary(mcmclist)
Iterations = 100001:149951
Thinning interval = 50
Number of chains = 20
Sample size per chain = 1000

1. Empirical mean and standard deviation for each variable,
 plus standard error of the mean:

                                               Mean      SD  Naive SE Time-series SE
(Intercept)                                 1.36326 0.04848 0.0003428      0.0003542
traitza_children                           -0.76679 0.28738 0.0020321      0.0016682
male                                        0.09980 0.01633 0.0001155      0.0001222
spouses                                     0.12333 0.01957 0.0001384      0.0001414
paternalage.mean                            0.07215 0.02194 0.0001551      0.0001596
paternalage.factor(25,30]                  -0.03381 0.04184 0.0002959      0.0003066
paternalage.factor(30,35]                  -0.08380 0.04270 0.0003019      0.0003118
paternalage.factor(35,40]                  -0.16502 0.04569 0.0003231      0.0003289
paternalage.factor(40,45]                  -0.16738 0.05090 0.0003599      0.0003697
paternalage.factor(45,50]                  -0.18383 0.05880 0.0004158      0.0004242
paternalage.factor(50,55]                  -0.18241 0.07277 0.0005146      0.0005302
paternalage.factor(55,90]                  -0.40612 0.09875 0.0006983      0.0007467
traitza_children:male                       0.12092 0.08223 0.0005815      0.0004697
traitza_children:spouses                    0.64881 0.21132 0.0014942      0.0008511
traitza_children:paternalage.mean          -0.02741 0.08550 0.0006046      0.0006221
traitza_children:paternalage.factor(25,30] -0.17296 0.18680 0.0013209      0.0013750
traitza_children:paternalage.factor(30,35] -0.19027 0.19267 0.0013624      0.0013901
traitza_children:paternalage.factor(35,40] -0.24911 0.21282 0.0015049      0.0014391
traitza_children:paternalage.factor(40,45] -0.29772 0.23403 0.0016548      0.0015956
traitza_children:paternalage.factor(45,50] -0.51782 0.28589 0.0020215      0.0017602
traitza_children:paternalage.factor(50,55] -0.46126 0.32064 0.0022673      0.0021397
traitza_children:paternalage.factor(55,90] -0.38612 0.41461 0.0029317      0.0027396

2. Quantiles for each variable:

                                               2.5%      25%      50%      75%      97.5%
(Intercept)                                 1.26883  1.33106  1.36322  1.39575  1.4589722
traitza_children                           -1.20696 -0.95751 -0.81076 -0.63308 -0.0365042
male                                        0.06785  0.08878  0.09970  0.11085  0.1320168
spouses                                     0.08467  0.11030  0.12343  0.13643  0.1617869
paternalage.mean                            0.02950  0.05751  0.07202  0.08683  0.1153881
paternalage.factor(25,30]                  -0.11581 -0.06174 -0.03397 -0.00574  0.0473783
paternalage.factor(30,35]                  -0.16656 -0.11250 -0.08358 -0.05519  0.0003065
paternalage.factor(35,40]                  -0.25518 -0.19530 -0.16500 -0.13440 -0.0757366
paternalage.factor(40,45]                  -0.26887 -0.20164 -0.16675 -0.13335 -0.0677407
paternalage.factor(45,50]                  -0.30080 -0.22320 -0.18339 -0.14440 -0.0687967
paternalage.factor(50,55]                  -0.32663 -0.23034 -0.18227 -0.13317 -0.0415547
paternalage.factor(55,90]                  -0.60202 -0.47303 -0.40454 -0.33994 -0.2139128
traitza_children:male                      -0.01083  0.06634  0.11024  0.16109  0.3295892
traitza_children:spouses                    0.37857  0.51072  0.59398  0.71395  1.2127940
traitza_children:paternalage.mean          -0.19138 -0.08250 -0.02985  0.02493  0.1468989
traitza_children:paternalage.factor(25,30] -0.57457 -0.28481 -0.16489 -0.05151  0.1728148
traitza_children:paternalage.factor(30,35] -0.61499 -0.30350 -0.17736 -0.06299  0.1555147
traitza_children:paternalage.factor(35,40] -0.74251 -0.36752 -0.22966 -0.10777  0.1151897
traitza_children:paternalage.factor(40,45] -0.84165 -0.42691 -0.27729 -0.14322  0.1032436
traitza_children:paternalage.factor(45,50] -1.21782 -0.66568 -0.48420 -0.32873 -0.0476720
traitza_children:paternalage.factor(50,55] -1.21327 -0.63623 -0.43432 -0.24957  0.0955360
traitza_children:paternalage.factor(55,90] -1.33772 -0.62227 -0.35364 -0.11050  0.3361684

effectiveSize(mcmclist)
(Intercept)                                18814.05
traitza_children                           16359.33
male                                       18132.98
spouses                                    19547.05
paternalage.mean                           19238.72
paternalage.factor(25,30]                  18974.81
paternalage.factor(30,35]                  18874.33
paternalage.factor(35,40]                  19406.63
paternalage.factor(40,45]                  19075.18
paternalage.factor(45,50]                  19401.77
paternalage.factor(50,55]                  18960.11
paternalage.factor(55,90]                  17893.23
traitza_children:male                      18545.55
traitza_children:spouses                   14438.51
traitza_children:paternalage.mean          18464.09
traitza_children:paternalage.factor(25,30] 16943.43
traitza_children:paternalage.factor(30,35] 16827.44
traitza_children:paternalage.factor(35,40] 17230.04
traitza_children:paternalage.factor(40,45] 17144.78
traitza_children:paternalage.factor(45,50] 18191.67
traitza_children:paternalage.factor(50,55] 17466.60
traitza_children:paternalage.factor(55,90] 18540.59

### current script:

# bsub -q mpi -W 24:00 -n 21 -R np20 mpirun -H localhost -n 21 R --slave -f "/usr/users/rarslan/rpqa/krmh_main/children.R"
setwd("/usr/users/rarslan/rpqa/")
library(doMPI)
cl <-  
startMPIcluster(verbose=T,workdir="/usr/users/rarslan/rpqa/krmh_main/")
registerDoMPI(cl)
Children = foreach(i=1:clusterSize(cl),.options.mpi =  
list(seed=1337) ) %dopar% {
	library(MCMCglmm);library(data.table)
  setwd("/usr/users/rarslan/rpqa/krmh_main/")
	source("../1 - extraction functions.r")
  load("../krmh1.rdata")

	krmh.1 = recenter.pat(na.omit(krmh.1[spouses>0, list(idParents,  
children, male, spouses, paternalage)]))

	samples = 1000
	thin = 50; burnin = 100000
	nitt = samples * thin + burnin

	prior <- list(
		R=list(V=diag(2), nu=1.002, fix=2),
		G=list(G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000))
	)

	start <- list(
		liab=c(rnorm( nrow(krmh.1)*2 )),
		R = list(R1 = rIW(diag(2), 10 )),
		G = list(G1 = rIW(diag(2), 10 ))
	)

	( m1 = MCMCglmm( children ~ trait * (male + spouses +  
paternalage.mean + paternalage.factor),
						rcov=~idh(trait):units,
						random=~idh(trait):idParents,
						family="zapoisson",
						start = start,
						prior = prior,
						data=krmh.1,
						pr = F, saveX = F, saveZ = F,
						nitt=nitt,thin=thin,burnin=burnin)
	)
		m1$Residual$nrt<-2
	m1
}

save(Children,file = "Children.rdata")
closeCluster(cl)
mpi.quit()

On 28 Aug 2014, at 20:59, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:

Hi,

The posteriors for the two models look pretty close to me. Are  
the scale reduction factors really as high as previously  
reported? Before you had 1.83 for traitza_children:spouses, but  
the plot suggests that it should be close to 1?

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug 2014  
19:59:16 +0200:

Sure! Thanks a lot.
I am using ~idh(trait):units already, sorry for saying that  
incorrectly in my last email.
These models aren't the final thing, I will replace the  
paternalage.factor variable
with its linear equivalent if that seems defensible (does so  
far) and in this model it seems
okay to remove the za-effects for all predictors except spouses.
So a final model would have fewer fixed effects. I also have  
datasets of 200k+ and 5m+,
but I'm learning MCMCglmm with this smaller one because my  
wrong turns take less time.

I've uploaded a comparison coef plot of two models:
http://i.imgur.com/sHUfnmd.png
m7 is with the default starting values, m1 is with the  
specification I sent in my last email. I don't
know if such differences are something to worry about.

I don't know what qualifies as highly overdispersed; here's a  
plot of the outcome for ever-married people (slate=real data):
http://imgur.com/14MywgZ
Here it is with everybody born (incl. some stillborn etc.):
http://imgur.com/knRGa1v
I guess my approach (generating an overdispersed Poisson with  
the parameters from the data and checking whether it has as many
excess zeroes) is not the best way to diagnose zero-inflation,
but especially in the second case it seems fairly clear-cut.

Best regards,

Ruben

summary(m1)
Iterations = 50001:149951
Thinning interval  = 50
Sample size  = 2000

DIC: 31249.73

G-structure:  ~idh(trait):idParents

                   post.mean  l-95% CI u-95% CI eff.samp
children.idParents     0.006611 4.312e-08   0.0159    523.9
za_children.idParents  0.193788 7.306e-02   0.3283    369.3

R-structure:  ~idh(trait):units

               post.mean l-95% CI u-95% CI eff.samp
children.units       0.1285   0.1118   0.1452    716.1
za_children.units    0.9950   0.9950   0.9950      0.0

Location effects: children ~ trait * (male + spouses + paternalage.mean + paternalage.factor)

                                            post.mean   l-95% CI   u-95% CI eff.samp  pMCMC
(Intercept)                                 1.3413364  1.2402100  1.4326099     1789 <5e-04 ***
traitza_children                           -0.8362879 -1.2007980 -0.5016730     1669 <5e-04 ***
male                                        0.0994902  0.0679050  0.1297394     2000 <5e-04 ***
spouses                                     0.1236033  0.0839000  0.1624939     2000 <5e-04 ***
paternalage.mean                            0.0533892  0.0119569  0.0933960     2000  0.015 *
paternalage.factor(25,30]                  -0.0275822 -0.1116421  0.0537359     1842  0.515
paternalage.factor(30,35]                  -0.0691025 -0.1463214  0.0122393     1871  0.097 .
paternalage.factor(35,40]                  -0.1419933 -0.2277379 -0.0574678     1845 <5e-04 ***
paternalage.factor(40,45]                  -0.1364952 -0.2362714 -0.0451874     1835  0.007 **
paternalage.factor(45,50]                  -0.1445342 -0.2591767 -0.0421178     1693  0.008 **
paternalage.factor(50,55]                  -0.1302972 -0.2642965  0.0077061     2000  0.064 .
paternalage.factor(55,90]                  -0.3407879 -0.5168972 -0.1493652     1810 <5e-04 ***
traitza_children:male                       0.0926888 -0.0147379  0.2006142     1901  0.098 .
traitza_children:spouses                    0.5531197  0.3870616  0.7314289     1495 <5e-04 ***
traitza_children:paternalage.mean           0.0051463 -0.1279396  0.1460099     1617  0.960
traitza_children:paternalage.factor(25,30] -0.1538957 -0.4445749  0.1462955     1781  0.321
traitza_children:paternalage.factor(30,35] -0.1747883 -0.4757851  0.1162476     1998  0.261
traitza_children:paternalage.factor(35,40] -0.2261843 -0.5464379  0.0892582     1755  0.166
traitza_children:paternalage.factor(40,45] -0.2807543 -0.6079678  0.0650281     1721  0.100 .
traitza_children:paternalage.factor(45,50] -0.4905843 -0.8649214 -0.1244174     1735  0.010 **
traitza_children:paternalage.factor(50,55] -0.4648579 -0.9215759 -0.0002083     1687  0.054 .
traitza_children:paternalage.factor(55,90] -0.3945406 -1.0230155  0.2481568     1793  0.195
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

describe(krmh.1[spouses>0,])
                 vars    n mean   sd median trimmed  mad   min   max range skew kurtosis   se
children            2 6829 3.81 2.93   4.00    3.61 2.97  0.00 16.00 16.00 0.47    -0.46 0.04
male                3 6829 0.46 0.50   0.00    0.45 0.00  0.00  1.00  1.00 0.14    -1.98 0.01
spouses             4 6829 1.14 0.38   1.00    1.03 0.00  1.00  4.00  3.00 2.87     8.23 0.00
paternalage         5 6829 3.65 0.80   3.57    3.60 0.80  1.83  7.95  6.12 0.69     0.70 0.01
paternalage_c       6 6829 0.00 0.80  -0.08   -0.05 0.80 -1.82  4.30  6.12 0.69     0.70 0.01
paternalage.mean    7 6829 0.00 0.68  -0.08   -0.05 0.59 -1.74  4.30  6.04 0.95     1.97 0.01
paternalage.diff    8 6829 0.00 0.42   0.00   -0.01 0.38 -1.51  1.48  2.99 0.17     0.17 0.01

table(krmh.1$paternalage.factor)
[0,25] (25,30] (30,35] (35,40] (40,45] (45,50] (50,55] (55,90]
 309    1214    1683    1562    1039     623     269     130

On 28 Aug 2014, at 19:05, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:

Hi Ruben,

It might be hard to detect (near) ECPs with so many fixed  
effects (can you post the model summary (and give us the mean  
and standard deviation of any continuous covariates)). Also,  
the complementary log-log link (which is the za specification)  
is non-symmetric and runs into problems outside the range -35  
to 3.5 so there may be a problem there, particularly if you  
use rcov=~trait:units and the Poisson part is highly  
over-dispersed.  You could try rcov=~idh(trait):units and fix  
the non-identifiable za residual variance to something smaller  
than 1 (say 0.5)  - it will mix slower but it will reduce the  
chance of over/underflow.

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug  
2014 18:45:30 +0200:

Hi Jarrod,

1) it did not return an error with rcov = ~trait:units  
because you used R1=rpois(2,1)+1 and yet this specification  
only fits a single variance (not a 2x2 covariance matrix).  
R1=rpois(2,1)+1 is a bit of a weird specification since it  
has to be integer. I would obtain starting values using rIW().
I agree it's a weird specification, I was a bit lost and  
thought I could get away with just putting some random  
numbers in the starting value.
I didn't do R1=rpois(2,1)+1 though, I did  
R1=diag(rpois(2,1)+1), so I got a 2x2 matrix, but yes, bound  
to be integer.
I didn't know starting values should come from a conjugate  
distribution, though that probably means I didn't think about  
it much.

I'm now doing
start <- list(
	liab=c(rnorm( nrow(krmh.1)*2 )),
	R = list(R1 = rIW( diag(2), nrow(krmh.1)) ),
	G = list(G1 = rIW( diag(2), nrow(krmh.1)) )
)

Is this what you had in mind?
I am especially unsure if I am supposed to use such a low  
sampling variability (my sample size is probably not even  
relevant for the starting values) and if I should start from  
diag(2).

And, I am still happily confused that this specification  
still doesn't lead to errors with respect to rcov =  
~trait:units . Does this mean I'm doing it wrong?

3) a) how many effective samples do you have for each  
parameter? and b) are you getting extreme category  
problems/numerical issues? If you store the latent variables  
(pl=TRUE) what is their range for the zi/za part?
My parallel run using the above starting values isn't finished yet.
a) After applying the above starting values I get, for the  
location effects 1600-2000 samples for a 2000 sample chain  
(with thin set to 50). G and R-structure are from 369  
(za_children.idParents) to 716 (and 0 for the fixed part).
Effective sample sizes were similar for my run using the  
starting values for G/R that I drew from rpois, and using 40  
chains I of course get
b) I don't think I am getting extreme categories. I would  
probably be getting extreme categories if I included the  
forever-alones (they almost never have children), but this  
way no.
I wasn't sure how to examine the range of the latents  
separately for the za part, but for a single chain it looks  
okay:
quantile(as.numeric(m1$Liab),probs = c(0,0.01,0,0.99,1))
   0%        1%        0%       99%      100%
-4.934111 -1.290728 -4.934111  3.389847  7.484206
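
One way to look at the za liabilities on their own, sketched on a simulated stand-in for m1$Liab (rows are stored iterations, columns are liabilities). This assumes that the columns are ordered with all liabilities of the first trait (the Poisson part) first and those of the za part second, matching the liab=c(l1,l2) ordering used elsewhere in this thread; that ordering is an assumption worth checking against the fitted object before relying on it. The bounds -35 and 3.5 are the ones quoted for the complementary log-log link in this thread.

```r
set.seed(1)
n_obs     <- 50    # number of data rows (stand-in for nrow(krmh.1))
n_samples <- 20    # number of stored MCMC iterations

# Simulated stand-in for m1$Liab: n_samples x (2 * n_obs)
liab <- cbind(
  matrix(rnorm(n_samples * n_obs,  1, 1), nrow = n_samples),  # Poisson part
  matrix(rnorm(n_samples * n_obs, -1, 2), nrow = n_samples)   # za part
)

za_cols    <- (n_obs + 1):(2 * n_obs)        # assumed position of the za liabilities
range_pois <- range(liab[, -za_cols])
range_za   <- range(liab[, za_cols])

# Proportion of za liabilities outside the range where cloglog gets into trouble
outside <- mean(liab[, za_cols] < -35 | liab[, za_cols] > 3.5)

quantile(as.numeric(liab[, za_cols]), probs = c(0, 0.01, 0.5, 0.99, 1))
```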

Well, all considered now that I use the above starting value  
specification I get slightly different estimates for all  
za-coefficients. Nothing major, but still leading me to
think my estimates aren't exactly independent of the starting  
values I use. I'll see what the parallel run yields.

Thanks a lot,

Ruben

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Wed, 27 Aug  
2014 19:23:42 +0200:

Hi Jarrod,

thanks again. I was able to get it running with your advice.
Some points of confusion remain:

- You wrote that zi/za models would return an error with  
rcov = ~trait:units + starting values. This did not happen  
in my case, so I didn't build MCMCglmm myself with your  
suggested edits. Also, have you considered putting your own  
MCMCglmm repo on Github? Your users would be able to  
install pre-releases and I'd think you'd get some  
time-saving pull requests too.
- In my attempts to get my models to run properly, I messed  
up a prior and did not use fix=2 in my prior specification  
for my za models. This led to crappy convergence, it's much  
better now and for some of my simpler models I think I  
won't need parallel chains. I'm reminded of Gelman's folk  
theorem of statistical computing.
- I followed your advice, but of course I could not set the  
true values as starting values, but wanted to set random,  
bad starting values. I pasted below what I arrived at, I'm  
especially unsure whether I specified the starting values  
for G and R properly (I think not).
	start <- list(
		liab=c(rnorm( nrow(krmh.1)*2 )),
		R = list(R1 = diag(rpois(2, 1)+1)),
		G = list(G1 = diag(rpois(2, 1)+1))
	)

However, even though I may not need multiple chains for  
some of my simpler models, I've now run into conflicting  
diagnostics. The geweke.diag for each chain (and  
examination of the traces) gives
satisfactory diagnostics. Comparing multiple chains using  
gelman.diag, however, leads to one bad guy, namely the  
traitza_children:spouses interaction.
I think this implies that I've got some starting value  
dependence for this parameter, that won't be easily  
rectified through longer burnin?
Do you have any ideas how to rectify this?
I am currently doing sequential analyses on episodes of  
selection and in historical human data only those who marry  
have a chance at having kids. I exclude the unmarried
from my sample where I predict number of children, because  
I examine that in a previous model and the zero-inflation  
(65% zeros, median w/o unmarried = 4) when including the  
unmarried is so excessive.
Number of spouses is easily the strongest predictor in the  
model, but only serves as a covariate here. Since my other  
estimates are stable across chains and runs and agree well  
across models and with theory, I'm
inclined to shrug this off. But probably I shouldn't ignore  
this sign of non-convergence?

gelman.diag(mcmc_1)
Potential scale reduction factors:

                                      Point est. Upper C.I.
(Intercept)                                      1.00       1.00
traitza_children                                 1.27       1.39
male                                             1.00       1.00
spouses                                          1.00       1.00
paternalage.mean                                 1.00       1.00
paternalage.factor(25,30]                        1.00       1.00
paternalage.factor(30,35]                        1.00       1.00
paternalage.factor(35,40]                        1.00       1.00
paternalage.factor(40,45]                        1.00       1.00
paternalage.factor(45,50]                        1.00       1.00
paternalage.factor(50,55]                        1.00       1.00
paternalage.factor(55,90]                        1.00       1.00
traitza_children:male                            1.22       1.32
traitza_children:spouses                         1.83       2.13
traitza_children:paternalage.mean                1.02       1.02
traitza_children:paternalage.factor(25,30]       1.03       1.05
traitza_children:paternalage.factor(30,35]       1.05       1.08
traitza_children:paternalage.factor(35,40]       1.10       1.15
traitza_children:paternalage.factor(40,45]       1.12       1.17
traitza_children:paternalage.factor(45,50]       1.19       1.28
traitza_children:paternalage.factor(50,55]       1.12       1.18
traitza_children:paternalage.factor(55,90]       1.11       1.17

Multivariate psrf

7.27

Best regards,

Ruben

On 26 Aug 2014, at 13:04, Jarrod Hadfield  
<j.hadfield at ed.ac.uk> wrote:

Hi Ruben,

There are 400 liabilities in a zapoisson model (2 per  
datum). This code should work:

g <-sample(letters[1:10], size = 200, replace = T)
pred <- rnorm(200)

l1<-rnorm(200, -1, sqrt(1))
l2<-rnorm(200, -1, sqrt(1))

y<-VGAM::rzapois(200, exp(l1), exp(-exp(l2)))

# generate zero-altered data with an intercept of -1
# (because the intercept and variance are the same for both
# processes this is just standard Poisson)

dat<-data.frame(y=y, g = g, pred = pred)

start.1<-list(liab=c(l1,l2), R = list(R1=diag(2)),  
G=list(G1=diag(2)))
prior.1<-list(R=list(V=diag(2), nu=1.002, fix=2),  
G=list(G1=list(V=diag(2), nu=2, alpha.mu=c(0,0),  
alpha.V=diag(2)*1000)))

m1<-MCMCglmm(y~trait + pred:trait, random=~us(trait):g,  
family="zapoisson",rcov=~idh(trait):units, data=dat,  
prior=prior.1, start= start.1)

However, there are 2 bugs in the current version of  
MCMCglmm that return an error message when the  
documentation implies it should be fine:

a) it should be possible to have R=diag(2) rather than R =  
list(R1=diag(2)). This bug cropped up when I implemented  
block-diagonal R structures. It can be fixed by inserting:

    if(!is.list(start$R)){
       start$R<-list(R1=start$R)
    }

on L514 of MCMCglmm.R below

    if(!is.list(prior$R[[1]])){
       prior$R<-list(R1=prior$R)
    }

b) rcov=~trait:units models for zi/za models will return  
an error when passing starting values. To fix this insert

   if(diagR==3){
     if(dim(start)[1]!=1){
       stop("V is the wrong dimension for some start$G/start$R elements")
     }
     start<-diag(sum(nfl))*start[1]
   }

on L90 of priorformat.R below

   if(is.matrix(start)==FALSE){
     start<-as.matrix(start)
   }

I will put these in the new version.

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon, 25  
Aug 2014 21:52:30 +0200:

Hi Jarrod,

thanks for these pointers.

You will need to provide over-dispersed starting values  
for multiple-chain convergence diagnostics to be useful  
(GLMM are so simple I am generally happy if the output  
of a single run looks reasonable).
Oh, I would be happy with single chains, but since  
computation would take weeks this way, I wanted to  
parallelise and I would use the multi-chain convergence  
as a criterion that my parallelisation was proper
and is as informative as a single long chain. There don't  
seem to be any such checks built in? I was analysing my  
40 chains for a bit longer than I like to admit until I  
noticed they were identical (effectiveSize
and summary.mcmc.list did not yell at me for this).
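
A check of this kind is easy to write by hand. The sketch below is a hypothetical helper, not something provided by coda or MCMCglmm; it compares every pair of chains with identical() and is demonstrated on plain numeric vectors. Because identical() works on arbitrary R objects, the same function should also accept a list of mcmc objects, but that use is an assumption and is not exercised here.

```r
# Hypothetical helper: TRUE if any two chains in a list are exactly identical.
has_duplicate_chains <- function(chains) {
  n <- length(chains)
  if (n < 2) return(FALSE)
  pairs <- combn(n, 2)                       # all pairs of chain indices
  any(apply(pairs, 2, function(p) {
    identical(chains[[p[1]]], chains[[p[2]]])
  }))
}

set.seed(1)
a <- rnorm(100)
b <- rnorm(100)

dup_found  <- has_duplicate_chains(list(a, b, a))   # chain 1 and 3 are copies
dup_absent <- has_duplicate_chains(list(a, b))
```

Running such a check right after collecting the workers' results, before any convergence diagnostics, would flag the situation described above.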

# use some very bad starting values
I get that these values are bad, but that is the goal for  
my multi-chain aim, right?

I can apply this to my zero-truncated model, but am again  
getting stuck with the zero-altered one.
Maybe I need only specify the Liab values for this?
At least I'm getting nowhere with specifying R and G  
starting values here. When I got an error, I always
went to the MCMCglmm source to understand why the checks  
failed, but I didn't always understand
what was being checked and couldn't get it to work.

Here's a failing example:

l<-rnorm(200, -1, sqrt(1))
t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
g = sample(letters[1:10], size = 200, replace = T)
pred = rnorm(200)
y<-rpois(200,exp(l)-t)
y[1:40] = 0
# generate zero-altered data with an intercept of -1

dat<-data.frame(y=y, g = g, pred = pred)
set.seed(1)
start_true = list(Liab=l, R = 1, G = 1 )
m1<-MCMCglmm(y~1 + pred,random = ~ g,  
family="zapoisson",rcov=~us(trait):units, data=dat,  
start= start_true)

# use true latent variable as starting values
set.seed(1)
# use some very bad starting values
start_rand = list(Liab=rnorm(200), R = rpois(1, 1)+1, G =  
rpois(1, 1)+1 )
m2<-MCMCglmm(y~1 + pred,random = ~  
g,rcov=~us(trait):units,  family="zapoisson", data=dat,  
start = start_rand)

Best,

Ruben

On 25 Aug 2014, at 18:29, Jarrod Hadfield  
<j.hadfield at ed.ac.uk> wrote:

Hi Ruben,

Sorry  - I was wrong when I said that everything is  
Gibbs sampled conditional on the latent variables. The  
location effects (fixed and random effects) are also  
sampled conditional on the (co)variance components so  
you should add them to the starting values. In the case  
where the true values are used:

m1<-MCMCglmm(y~1, family="ztpoisson", data=dat,  
start=list(Liab=l,R=1))

Cheers,

Jarrod

Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk> on Mon, 25  
Aug 2014 17:14:14 +0100:

Hi Ruben,

You will need to provide over-dispersed starting values  
for multiple-chain convergence diagnostics to be useful  
(GLMM are so simple I am generally happy if the output  
of a single run looks reasonable).

With non-Gaussian data everything is Gibbs sampled  
conditional on the latent variables, so you only need  
to pass them:

l<-rnorm(200, -1, sqrt(1))
t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
y<-rpois(200,exp(l)-t)+1
# generate zero-truncated data with an intercept of -1

dat<-data.frame(y=y)
set.seed(1)
m1<-MCMCglmm(y~1, family="ztpoisson", data=dat,  
start=list(Liab=l))
# use true latent variable as starting values
set.seed(1)
m2<-MCMCglmm(y~1, family="ztpoisson", data=dat,  
start=list(Liab=rnorm(200)))
# use some very bad starting values

plot(mcmc.list(m1$Sol, m2$Sol))
# not identical despite the same seed because of
# different starting values but clearly sampling the same
# posterior distribution:

gelman.diag(mcmc.list(m1$Sol, m2$Sol))

Cheers,

Jarrod

Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon, 25  
Aug 2014 18:00:08 +0200:

Dear Jarrod,

thanks for the quick reply. Please don't waste time  
looking into doMPI; I am happy that I
get the expected result when I specify that  
reproducible seed, whatever the reason may be.
I'm pretty sure that is the deciding factor, because I  
tested it explicitly, I just have no idea
how/why it interacts with the choice of family.

That said, is setting up different RNG streams for my  
workers (now that it works) __sufficient__
so that I get independent chains and can use  
gelman.diag() for convergence diagnostics?
Or should I still tinker with the starting values myself?
I've never found a worked example of supplying  
starting values and am thus a bit lost.

Sorry for sending further questions, I hope someone  
else takes pity while
you're busy with lectures.

Best wishes

Ruben

On 25 Aug 2014, at 17:29, Jarrod Hadfield  
<j.hadfield at ed.ac.uk> wrote:

Hi Ruben,

I do not think the issue is with the starting values,  
because even if the same starting values were used  
the chains would still differ because of the  
randomness in the Markov Chain (if I interpret your  
`identical' test correctly). This just involves a  
call to GetRNGstate() in the C++ code (L871 of  
MCMCglmm.cc) so I think for some reason  
doMPI/foreach is not doing what you expect. I am not  
familiar with doMPI and am in the middle of writing  
lectures so haven't got time to look into it  
carefully. Outside of the context of doMPI I get the  
behaviour I expect:

l<-rnorm(200, -1, sqrt(1))
t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
y<-rpois(200,exp(l)-t)+1
# generate zero-truncated data with an intercept of -1

dat<-data.frame(y=y)
set.seed(1)
m1<-MCMCglmm(y~1, family="ztpoisson", data=dat)
set.seed(2)
m2<-MCMCglmm(y~1, family="ztpoisson", data=dat)
set.seed(2)
m3<-MCMCglmm(y~1, family="ztpoisson", data=dat)

plot(mcmc.list(m1$Sol, m2$Sol))
# different, as expected
plot(mcmc.list(m2$Sol, m3$Sol))
# the same, as expected

Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon,  
25 Aug 2014 16:58:06 +0200:

Dear list,

sorry for bumping my old post, I hope to elicit a  
response with a more focused question:

When does MCMCglmm automatically start from  
different values when using doMPI/foreach?

I have done some tests with models of varying  
complexity. For example, the script in my last
post (using "zapoisson") yielded 40 identical chains:
identical(mcmclist[1], mcmclist[30])
TRUE

A simpler (?) model (using "ztpoisson" and no  
specified prior), however, yielded different chains
and I could use them to calculate gelman.diag()

Changing my script to the version below, i.e.  
seeding foreach using .options.mpi=list( seed= 1337)
so as to make RNGstreams reproducible (or so I   
thought), led to different chains even for the
"zapoisson" model.

In no case have I (successfully) tried to override
the default of MCMCglmm's "start" argument.
Is starting my models from different RNG substreams
inadequate compared to manipulating
the start argument explicitly? If so, is there any
worked example of explicit starting value manipulation
in parallel computation?
I've browsed the MCMCglmm source to understand how  
the default starting values are generated,
but didn't find any differences with respect to RNG  
for the two families "ztpoisson" and "zapoisson"
(granted, I did not dig very deep).
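
The RNG-stream question can be explored without an MPI cluster. The following is a minimal sketch, assuming base R's parallel package and its L'Ecuyer-CMRG generator; it does not reproduce doMPI's own seeding, which may work differently. The names streams and draw_start are illustrative only.

```r
library(parallel)

# Switch to the generator that supports independent streams.
# Note: this changes the RNG kind for the whole R session.
RNGkind("L'Ecuyer-CMRG")
set.seed(1337)

# Derive one stream per (hypothetical) worker from the master seed.
n_chains <- 4
streams <- vector("list", n_chains)
streams[[1]] <- get(".Random.seed", envir = globalenv())
for (i in seq_len(n_chains)[-1]) {
  streams[[i]] <- nextRNGStream(streams[[i - 1]])
}

# Hypothetical helper: install a stream, then draw from it.
# On a real worker, the model fit would follow the assign() call.
draw_start <- function(stream, n = 5) {
  assign(".Random.seed", stream, envir = globalenv())
  rnorm(n)
}

starts <- lapply(streams, draw_start)
```

Each element of starts comes from a different stream, and re-installing the same stream reproduces the same draws, which is the property one would want from reproducible but distinct parallel chains.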

Best regards,

Ruben Arslan

# bsub -q mpi -W 12:00 -n 41 -R np20 mpirun -H localhost -n 41 R --slave -f "/usr/users/rarslan/rpqa/rpqa_main/rpqa_children_parallel.R"

library(doMPI)
cl <- startMPIcluster(verbose=TRUE,workdir="/usr/users/rarslan/rpqa/rpqa_main/")
registerDoMPI(cl)
Children_mcmc1 = foreach(i=1:clusterSize(cl),.options.mpi = list(seed=1337) ) %dopar% {
	library(MCMCglmm);library(data.table)
	load("/usr/users/rarslan/rpqa/rpqa1.rdata")

	nitt = 130000; thin = 100; burnin = 30000
	prior.m5d.2 = list(
		R = list(V = diag(c(1,1)), nu = 0.002),
		G=list(list(V=diag(c(1,1e-6)),nu=0.002))
	)

	rpqa.1 = na.omit(rpqa.1[spouses>0, list(idParents,  
children, male, urban, spouses, paternalage.mean,  
paternalage.factor)])
	(m1 = MCMCglmm( children ~ trait * (male + urban +  
spouses + paternalage.mean + paternalage.factor),
						rcov=~us(trait):units,
						random=~us(trait):idParents,
						family="zapoisson",
						prior = prior.m5d.2,
						data=rpqa.1,
						pr = F, saveX = F, saveZ = F,
						nitt=nitt,thin=thin,burnin=burnin))
}

library(coda)
mcmclist =  
mcmc.list(lapply(Children_mcmc1,FUN=function(x) {  
x$Sol}))
save(Children_mcmc1,mcmclist, file =  
"/usr/users/rarslan/rpqa/rpqa_main/rpqa_mcmc_kids_za.rdata")
closeCluster(cl)
mpi.quit()

On 04 Aug 2014, at 20:25, Ruben Arslan  
<rubenarslan at gmail.com> wrote:

Dear list,

would someone be willing to share her or his  
efforts in parallelising a MCMCglmm analysis?

I had something viable using harvestr that seemed  
to properly initialise
the starting values from different random number  
streams (which is desirable,
as far as I could find out), but I ended up being  
unable to use harvestr, because
it uses an old version of plyr, where  
parallelisation works only for multicore, not for
MPI.

I have pasted my working version, which does nothing
about starting values or RNG, at the end of this email.
I can try to fumble further in the dark or try to
update harvestr, but maybe someone has gone through
all this already.

I'd also appreciate any tips for elegantly
post-processing such parallel data, as some of my usual
extraction functions and routines are hampered by
the fact that some coda functions
do not aggregate results over chains. (What I get
from a single-chain summary in MCMCglmm
is a bit more comprehensive than what I managed to
cobble together with my own extraction
functions.)
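
One simple way to aggregate over chains is to stack the samples and summarise the pooled draws. The sketch below uses base R only and simulated matrices as a hypothetical stand-in for the per-chain $Sol objects; the names chains, pooled and pooled_summary are illustrative. Pooling like this is only sensible once the chains have been checked for convergence.

```r
# Hypothetical stand-in for lapply(Children_mcmc1, function(x) x$Sol):
# three chains, 100 stored samples each, two parameters.
set.seed(1)
chains <- replicate(
  3,
  matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("b0", "b1"))),
  simplify = FALSE
)

# Stack all chains into one samples-by-parameters matrix.
pooled <- do.call(rbind, chains)

# Posterior mean and 95% quantile interval per parameter.
pooled_summary <- t(apply(pooled, 2, function(x) {
  c(mean = mean(x), quantile(x, c(0.025, 0.975)))
}))
```

The result has one row per parameter, with columns for the mean and the two interval bounds.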

The reason I'm parallelising my analyses is that  
I'm having trouble getting a good effective
sample size for any parameter having to do with the  
many zeroes in my data.
Any pointers are much appreciated; I'm quite
inexperienced with MCMCglmm.

Best wishes

Ruben

# bsub -q mpi-short -W 2:00 -n 42 -R np20 mpirun -H localhost -n 41 R --slave -f "rpqa/rpqa_main/rpqa_children_parallel.r"
library(doMPI)
cl <- startMPIcluster()
registerDoMPI(cl)
Children_mcmc1 = foreach(i=1:40) %dopar% {
	library(MCMCglmm)
	load("rpqa1.rdata")

	nitt = 40000; thin = 100; burnin = 10000
	prior = list(
		R = list(V = diag(c(1,1)), nu = 0.002),
		G=list(list(V=diag(c(1,1e-6)),nu=0.002))
	)

	MCMCglmm( children ~ trait -1 +  
at.level(trait,1):male + at.level(trait,1):urban +  
at.level(trait,1):spouses +  
at.level(trait,1):paternalage.mean +  
at.level(trait,1):paternalage.factor,
		rcov=~us(trait):units,
		random=~us(trait):idParents,
		family="zapoisson",
		prior = prior,
		data=rpqa.1,
		pr = F, saveX = T, saveZ = T,
		nitt=nitt,thin=thin,burnin=burnin)
}

library(coda)
mcmclist =  
mcmc.list(lapply(Children_mcmc1,FUN=function(x) {  
x$Sol}))
save(Children_mcmc1,mcmclist, file =  
"rpqa_mcmc_kids_za.rdata")
closeCluster(cl)
mpi.quit()

--
Ruben C. Arslan

Georg August University Göttingen
Biological Personality Psychology and Psychological Assessment
Georg Elias Müller Institute of Psychology
Goßlerstr. 14
37073 Göttingen
Germany
Tel.: +49 551 3920704
https://psych.uni-goettingen.de/en/biopers/team/arslan


--
The University of Edinburgh is a charitable body,  
registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
