Strange parametrization in polr
On Thu, 8 Jan 2004, BXC (Bendix Carstensen) wrote:
In Venables \& Ripley 3rd edition (p. 231) the proportional odds model is described as: logit(p<=k) = zeta_k + eta but polr apparently thinks there is a minus in front of eta, as is apprent below. Is this a bug og a feature I have overlooked?
If there is really a bug I would guess that it was in the book rather than the code. This is not an unusual parametrisation for this model. It is the parametrisation that reduces to logistic regression for binary data, and makes the regression coefficients positive when the association is positive. -thomas
Here is the naked code for reproduction, below the results. ------------------------------------------------------------------------ --- version library( MASS ) data( housing ) hnames <- lapply( housing[,-5], levels ) house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing, weights=Freq ) summary( house.plr ) newdat <- expand.grid( hnames[-1] )[1:5,] cbind( newdat, predict( house.plr, newdat, type="probs" ) ) # Baseline probs: diff( c(0,tigol( c(-0.4961,0.6907) ), 1) ) # First level of Infl: diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) ) # But the change of sign for eta is needed to reproduce the fitted values: # Line 2: diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) ) # Line 5: diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) ) ------------------------------------------------------------------------ --- Here is the resulting output: ------------------------------------------------------------------------ ---
version
_ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 1 minor 8.1 year 2003 month 11 day 21 language R
library( MASS ) data( housing ) hnames <- lapply( housing[,-5], levels ) house.plr <- polr( Sat ~ Infl + Type + Cont, data=housing,
weights=Freq )
summary( house.plr )
Re-fitting to get Hessian
Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)
Coefficients:
Value Std. Error t value
InflMedium 0.5663922 0.10465276 5.412109
InflHigh 1.2888137 0.12715609 10.135682
TypeApartment -0.5723552 0.11923800 -4.800107
TypeAtrium -0.3661908 0.15517331 -2.359882
TypeTerrace -1.0910073 0.15148595 -7.202036
ContHigh 0.3602803 0.09553577 3.771156
Intercepts:
Value Std. Error t value
Low|Medium -0.4961 0.1248 -3.9740
Medium|High 0.6907 0.1255 5.5049
Residual Deviance: 3479.149
AIC: 3495.149
newdat <- expand.grid( hnames[-1] )[1:5,] cbind( newdat, predict( house.plr, newdat, type="probs" ) )
Infl Type Cont Low Medium High 1 Low Tower Low 0.3784485 0.2876755 0.3338760 2 Medium Tower Low 0.2568261 0.2742125 0.4689614 3 High Tower Low 0.1436927 0.2110841 0.6452232 4 Low Apartment Low 0.5190450 0.2605077 0.2204473 5 Medium Apartment Low 0.3798522 0.2875967 0.3325511
# Baseline probs: diff( c(0,tigol( c(-0.4961,0.6907) ), 1) )
[1] 0.3784576 0.2876650 0.3338774
# First level of Infl: diff( c(0,tigol( c(-0.4961,0.6907) + 0.5663922 ), 1) )
[1] 0.5175658 0.2609593 0.2214749
# But the change of sign for eta is needed to reproduce the fitted
values:
# Line 2: diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 ), 1) )
[1] 0.2568335 0.2742035 0.4689630
# Line 5: diff( c(0,tigol( c(-0.4961,0.6907) - 0.5663922 + 0.5723552 ), 1) )
[1] 0.3798613 0.2875862 0.3325525
------------------------------------------------------------------------ ----- ---------------------- Bendix Carstensen Senior Statistician Steno Diabetes Center Niels Steensens Vej 2 DK-2820 Gentofte Denmark tel: +45 44 43 87 38 mob: +45 30 75 87 38 fax: +45 44 43 07 06 bxc at steno.dk www.biostat.ku.dk/~bxc
______________________________________________ R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle