Expressing a multinomial GLM as a series of binomial GLMs
Dear Christoph,
If I understand correctly what you've done, the two approaches are not equivalent and should not in general produce the same fitted probabilities.
Letting {a, b} represent logit(a vs. b) = log(Pr(a)/Pr(b)) and {ab, cd} represent logit(a or b vs. c or d), and numbering the response levels 1, 2, 3, 4, then the multinomial logit model fits the logits {2, 1}, {3, 1}, {4, 1}, while your binary logit models are for the logits {12, 34}, {23, 14}, {34, 12}. Note that the first and third are complementary, but even if you had used three distinct logits of this kind, the combined models (which BTW wouldn't be independent) would not be equivalent to the multinomial logit model.
I hope that this helps (and that I've not misconstrued what you did).
Best,
John
------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
On Tue, 22 Jul 2014 16:47:17 +0200
"Scherber, Christoph" <cscherb1 at gwdg.de> wrote:
Dear all,
I am trying to express a multinomial GLM (using nnet) as a series of GLM models.
However, when I compare the multinom() predictions to those from GLM, I see differences that I can?t
explain. Can anyone help me out here?
Here comes a reproducible example:
##
# set up data: (don?t care what they are, just for playing)
set.seed(0)
cats=c("oligolectic","polylectic","specialist","generalist")
explan1=c("natural","managed")
explan2=c("meadow","meadow","pasture","pasture")
multicats=factor(sample(cats,replace=T,100,prob=c(0.5,0.2,0.1,0.5)))
multiplan1=factor(rep(explan1,50))
multiplan2=factor(rep(explan2,25))
########################
library(nnet)
m2=multinom(multicats~multiplan1)
# predictions from multinomial model
predict(m2,type="probs")
########################
# now set up contrasts for response variable "multicats" (which has 4 levels):
ii=as.numeric(multicats)
g1=glm(I(ii%in%c(1,2)) ~ multiplan1, family = "binomial")
g2=glm(I(ii%in%c(2,3)) ~ multiplan1, family = "binomial")
g3=glm(I(ii%in%c(3,4)) ~ multiplan1, family = "binomial")
r1=predict(g1,type="response")
r2=predict(g2,type="response")
r3=predict(g3,type="response")
# calculate predictions (based on Chapter 8.3 in Dobson 2002, Introduction to GLMs)
ee0=1/(1+r1+r2+r3)
ee1=r1/(1+r1)
ee2=r2/(1+r1+r2)
ee3=r3/(1+r1+r2+r3)
# compare predictions between GLM and multinom fits:
apply(cbind(ee0,ee1,ee2,ee3),2,mean)
apply(predict(m2,type="probs"),2,mean)
#################
[using R 3.1.1 on Windows 7 32-bit]
--
PD Dr. Christoph Scherber
Senior Lecturer
DNPW, Agroecology
University of Goettingen
Grisebachstrasse 6
37077 Goettingen
Germany
telephone +49 551 39 8807
facsimile +49 551 39 8806
www.gwdg.de/~cscherb1
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.