
R-sig-mixed-models Digest, Vol 87, Issue 34

Hi Malcolm,

I've commented in-line ...

Quoting Malcolm Fairbrother <M.Fairbrother at bristol.ac.uk> on Tue, 25  
Mar 2014 02:08:39 +0000:
The threshold variance should not include the probit variance of 1.  
The probit variance is what is specified as the units variance in  
family="threshold":

diff(pnorm(c(-Inf, 0, colMeans(MC2$CP), Inf) - colMeans(MC2$Sol) %*% c(1, 0, 1),
           0, sqrt(sum(colMeans(MC2$VCV)))))  # for tempcold and contactyes
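As a self-contained sketch (with made-up numbers standing in for the posterior means, since the fitted MC2 object isn't reproduced here), the category probabilities are successive differences of pnorm evaluated at the cutpoints, shifted by the linear predictor and scaled by the square root of the total (marginal) variance:

```r
## Hypothetical values standing in for posterior means from an MCMCglmm fit:
## first cutpoint fixed at 0, two further (made-up) cutpoints
cutpoints <- c(-Inf, 0, 0.8, 1.6, Inf)
eta       <- 0.5      # hypothetical linear predictor for the focal covariates
total_var <- 1 + 0.4  # units (probit) variance of 1 plus a random-effect variance

## Category probabilities: differences of the normal CDF at the cutpoints
p <- diff(pnorm(cutpoints - eta, 0, sqrt(total_var)))
p
sum(p)  # the probabilities over the ordinal categories sum to 1
```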

PS: your code is a neater way of getting the probabilities than mine, but it is perhaps easier to understand written as:

diff(pnorm(c(-Inf, 0, colMeans(MC2$CP), Inf), colMeans(MC2$Sol) %*% c(1, 0, 1),
           sqrt(sum(colMeans(MC2$VCV)))))  # for tempcold and contactyes
In this model we have to decide whether we want to predict for contact=yes or contact=no. We could marginalise this too, but that would require knowing something about the distribution of contact in the population, which isn't modelled in this instance. Let's say we want to predict for contact=yes, as you have attempted. The prediction for judge 1 is:

diff(pnorm(c(-Inf, 0, colMeans(MC3$CP), Inf) - colMeans(MC3$Sol)[c(1:4, 13)] %*% c(1, 0, 1, 1, 1),
           0, sqrt(mean(MC3$VCV[, "units"]))))

because the only random terms left to marginalise are the residuals.

Marginalising the judge effects is a bit trickier in this instance because the judge effects for contact=yes are the sum of two terms, and we need to know the variance of that sum: the sum of the variances of the two effects plus twice the covariance between them. Adding the units variance as well, this is just the sum of all the (co)variances (because MCMCglmm returns the full covariance matrix, so each covariance appears twice):

diff(pnorm(c(-Inf, 0, colMeans(MC3$CP), Inf) - colMeans(MC3$Sol)[c(1:3)] %*% c(1, 0, 1),
           0, sqrt(sum(colMeans(MC3$VCV)))))
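A small numeric check of this identity, with a made-up 2x2 covariance matrix standing in for the posterior mean of the judge (co)variances:

```r
## Hypothetical posterior-mean covariance matrix of the judge effects
## (intercept and contactyes), of the kind stored row-wise in MC3$VCV
V <- matrix(c(1.2, 0.3,
              0.3, 0.7), nrow = 2)

## Variance of the sum of the two effects
var_sum <- V[1, 1] + V[2, 2] + 2 * V[1, 2]

## Since the full matrix contains each covariance twice, this is just the
## sum of all its elements
c(var_sum = var_sum, sum_of_all = sum(V))
```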

In some ways it would be easier (and more instructive in this example)  
to have fitted us(contact) rather than us(1+contact). With the former  
parameterisation the first set of random effects is associated with  
contact=no (as before) but the second set of random effects is  
associated directly with contact=yes (unlike before). Marginalising  
the random effects for contact=yes is then:

diff(pnorm(c(-Inf, 0, colMeans(MC3$CP), Inf) - colMeans(MC3$Sol)[c(1:3)] %*% c(1, 0, 1),
           0, sqrt(sum(colMeans(MC3$VCV[, c("units", "yes:yes.judge")])))))

and for contact=no is

diff(pnorm(c(-Inf, 0, colMeans(MC3$CP), Inf) - colMeans(MC3$Sol)[c(1:3)] %*% c(1, 0, 1),
           0, sqrt(sum(colMeans(MC3$VCV[, c("units", "no:no.judge")])))))

The covariances are not required under this formulation for these predictions.

Note that these predictions are based on the posterior mean of the parameters. It would be more satisfying to create predictions for each posterior sample of the parameters and then average the predictions. This would give the posterior mean of the predictions, rather than predictions conditional on the posterior mean of the parameters. However, in many cases the two will be close.
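A sketch of the difference, using simulated draws in place of the rows of MC3$Sol and MC3$VCV (all numbers here are made up for illustration):

```r
set.seed(1)
n_samp    <- 1000
eta_draws <- rnorm(n_samp, 0.5, 0.2)   # hypothetical linear predictor per posterior sample
var_draws <- 1 + rgamma(n_samp, 2, 4)  # units variance plus a hypothetical random-effect variance
cutpoints <- c(-Inf, 0, 0.8, Inf)

## Prediction conditional on the posterior means of the parameters
p_at_mean <- diff(pnorm(cutpoints - mean(eta_draws), 0, sqrt(mean(var_draws))))

## Posterior mean of the predictions: predict per sample, then average
p_each <- sapply(seq_len(n_samp), function(i)
  diff(pnorm(cutpoints - eta_draws[i], 0, sqrt(var_draws[i]))))
p_post_mean <- rowMeans(p_each)

## Typically close, but not identical
round(rbind(at_mean = p_at_mean, post_mean = p_post_mean), 4)
```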

Cheers,

Jarrod