Multinomial model and p-values
On Jul 2, 2013, at 21:21 , Luciano La Sala wrote:
Hello everyone, I have a dataset which consists of "Pathology scores" (Absent, Mild, Severe) as outcome variable, and two main effects: Age (two factors: twenty / thirty days old) and Treatment Group (four factors: infected without ATB; infected + ATB1; infected + ATB2; infected + ATB3). First I tried to fit an ordinal regression model, which seems more appropriate given the characteristics of my dependent variable (ordinal). However, the assumption of odds proportionality was severely violated (graphically), which prompted me to use a multinomial model instead, using the "nnet" package. First I chose the outcome level that I need to use as baseline category: Data$Path <- relevel(Data$Path, ref = "Absent") Then, I needed to set baseline categories for the independent variables: Data$Age <- relevel(Data$Age, ref = "Twenty") Data$Treat <- relevel(Data$Treat, ref = "infected without ATB") The model: test <- multinom(Path ~ Treat + Age, data = Data) # weights: 18 (10 variable) initial value 128.537638 iter 10 value 80.623608 final value 80.619911 converged
summary1 <- summary(test1) summary1
Call:
multinom(formula = Jej_fact ~ Treat + Age, data = Data)
Coefficients:
(Intercept) infected+ATB1 infected+ATB2 infected+ATB3
AgeThirty
Moderate -2.238106 -1.1738540 -1.709608 -1.599301
2.684677
Severe -1.544361 -0.8696531 -2.991307 -1.506709
1.810771
Std. Errors:
(Intercept) infected+ATB1 infected+ATB2 infected+ATB3
AgeThirty
Moderate 0.7880046 0.8430368 0.7731359 0.7718480
0.8150993
Severe 0.6110903 0.7574311 1.1486203 0.7504781
0.6607360
Residual Deviance: 161.2398
AIC: 181.2398
For a while, I could not find a way to get the p-values for the model and
estimates when using nnet:multinom. Yesterday I came across a post where the
author put forward a similar issue regarding estimation of p-values for
coefficients
(http://stats.stackexchange.com/questions/9715/how-to-set-up-and-estimate-a-
multinomial-logit-model-in-r).
There, one blogger suggested that getting p-values from the summary() result
of multinom() is pretty easy, by first getting the t values as follows:
pt(abs(summary1$coefficients / summary1$standard.errors), df=nrow(Data)-10,
lower=FALSE)
(Intercept) infected+ATB1 infected+ATB2 infected+ATB3
AgeThirty
Moderate 0.002670340 0.08325396 0.014506395 0.02025858
0.0006587898
Severe 0.006433581 0.12665278 0.005216581 0.02352202
0.0035612114
I AM NOT a statistician, so don't be baffled by a silly question! I this
procedure correct?
There's at least a factor of 2 missing for a two-tailed p value. It is usually a mistake to use the t-distribution for what is really a z-statistic; for aggregated data, it can be a very bad mistake. However, it's not really an R question, and you obviously know where to find stackexchange... (local, professional advice would be even better, though.)
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com