MCMCglmm multinomial logit
Hi Jackson,
If x1 is the fixed-effect prediction for the log contrast B-A, then

plogis(x1/sqrt(1 + v1*c2))

gives the expected probability of being B rather than A after
integrating over the random effects (in your case the residuals, under
the assumption that they have unit variance, v1 = 1). You can write it
another way: let

nu1 = x1 + e1, with e1 ~ N(0, v1)

plogis(x1/sqrt(1 + v1*c2)) \approx int plogis(nu1) dnorm(e1, 0, sqrt(v1)) de1
and we can see the approximation is pretty good:
x1 <- 2                              # fixed-effect prediction for log contrast B-A
c2 <- (16 * sqrt(3) / (15 * pi))^2   # ~ 0.345843
v1 <- 1                              # residual variance
plogis(x1 / sqrt(1 + v1 * c2))       # closed-form approximation
integrate(function(e1) plogis(x1 + e1) * dnorm(e1, 0, sqrt(v1)), -Inf, Inf)
Obtaining the expected probability of being B rather than A, or C
rather than A, is therefore relatively straightforward.
Obtaining the expected probability of B rather than C is a bit trickier.
Take nu2 = x2 + e2, the linear predictor for the log contrast C-A. Then

nu1 - nu2 = log contrast B-C

which is

x1 + e1 - x2 - e2

and is distributed as N(x1 - x2, V[1,1] + V[2,2] - 2*V[1,2]), where in
your case V = R. We can therefore use the same approximation as before,
replacing x1 with x1 - x2 and v1 with V[1,1] + V[2,2] - 2*V[1,2].
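To illustrate (a sketch: x2 = 0.5 is an arbitrary value chosen for the
example, and V is taken to be the IJ residual matrix for j = 3
categories), the same closed-form approximation for the B-C contrast can
be checked against numerical integration over the residual difference:

```r
# Hypothetical fixed-effect predictions for the two log contrasts
x1 <- 2                               # B vs A
x2 <- 0.5                             # C vs A (arbitrary, for illustration)
c2 <- (16 * sqrt(3) / (15 * pi))^2    # ~ 0.345843
j <- 3
V <- (1/j) * (diag(j - 1) + matrix(1, j - 1, j - 1))  # the IJ residual matrix
vd <- V[1, 1] + V[2, 2] - 2 * V[1, 2] # variance of e1 - e2

# closed-form approximation for Pr(B rather than C)
plogis((x1 - x2) / sqrt(1 + vd * c2))

# numerical check: integrate over the residual difference e = e1 - e2
integrate(function(e) plogis(x1 - x2 + e) * dnorm(e, 0, sqrt(vd)),
          -Inf, Inf)
```

The two numbers agree to about two decimal places, as in the B-vs-A case.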
If we choose V = IJ then V[1,1] = V[2,2] = V[1,1] + V[2,2] - 2*V[1,2],
and so we multiply c2 by the same constant for all contrasts. This was
the main motivation. In fact, it might make more sense to scale I+J by
1/(j-1) rather than 1/j, so that all contrasts have unit variance, but
anything could be used really, as long as you are careful.
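A quick check of the two scalings for j = 3 (the variable names here are
just for this example):

```r
j <- 3
IJ  <- (1/j)       * (diag(j - 1) + matrix(1, j - 1, j - 1))  # course-notes scaling
IJ2 <- (1/(j - 1)) * (diag(j - 1) + matrix(1, j - 1, j - 1))  # 1/(j-1) scaling

# Under the 1/j scaling every contrast has variance 2/3; under the
# 1/(j-1) scaling every contrast has unit variance:
IJ[1, 1] + IJ[2, 2] - 2 * IJ[1, 2]      # 2/3
IJ2[1, 1] + IJ2[2, 2] - 2 * IJ2[1, 2]   # 1
```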
Looking at it now, the justification that these results/approximations
extend to the case A versus (B and C), as I suggest in the CourseNotes,
seems more tenuous, and I will check it out.
Cheers,
Jarrod
Quoting Jackson King <jackson.king722 at gmail.com> on Sun, 26 Aug 2012
09:28:45 -0500:
Hello,

I am attempting to fit a multinomial logit model (3 categories) with
random effects (across states). I have tried to follow the advice in the
course notes, but am a little uncertain about the reason for the prior
on the residuals, and how this is used to calculate predicted
probabilities. I would like my covariates to predict each outcome --
cov1 predicting option3 versus option1 and cov1 predicting option2
versus option1, rather than a main effect. Here is a simple version of
my model:

j <- length(levels(data$char))
I <- diag(j - 1)                          # 2x2 identity matrix
J <- matrix(1, j - 1, j - 1)              # 2x2 unit matrix
IJ <- (1/3) * (diag(2) + matrix(1, 2, 2)) # residual covariance matrix
prior <- list(R = list(V = IJ, fix = 1),
              G = list(G1 = list(V = diag(2), nu = 0.002)))
# Why can't I use V = diag(2) for the prior on R?
model <- MCMCglmm(char ~ -1 + trait*cov1,
                  random = ~idh(trait):state,
                  rcov = ~us(trait):units,
                  data = data, family = "categorical",
                  prior = prior, verbose = TRUE)

I find that the model converges, and the coefficients look similar to
maximum likelihood, but then I would like to predict probabilities of
being in each category. Typically, I think this is done with
plogis(x/sqrt(1+c2)), so why is it necessary to multiply by the delta
matrix (course notes p. 97)? Alternatively, if I simply use a 2x2
diagonal matrix for the prior on R, shouldn't I be able to use the same
transformation, plogis(x/sqrt(1+c2))?

In short, I am a little confused about the IJ matrix and where it comes
from. Is there a quick answer, or another paper that explains this? And
(2) is it reasonable to predict probabilities from my model, based on
fixed values of cov1, using the simple transformation above,
plogis(x/sqrt(1+c2))?

Many thanks.

Jackson
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models