From: John Fieberg [mailto:John.Fieberg at dnr.state.mn.us]
I have a data set with an ordinal response taking on one of 10
categories. I am considering using polr to fit a cumulative logit
model. I previously fit the model in SAS (using PROC LOGISTIC), which
provides a test of the proportional odds assumption (p < 0.001 for the
test). Are there simple diagnostic plots that can be used to look at
the validity of this assumption and possibly help with modifying the
model as appropriate? Any references or examples of useful R code for
addressing the proportional odds assumption would be much appreciated!
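
To make the kind of plot being asked about concrete, here is a minimal
sketch of one informal check. It is not the SAS score test. It fits a
separate binary logit at each cutpoint and compares the slopes with
the single slope from polr. The data are simulated, and the names x,
y, and dat are hypothetical stand-ins, not the data set described
above.

```r
library(MASS)  # provides polr()

set.seed(1)

# Simulated stand-in data (assumption): one predictor x and an ordered
# response y with 10 categories, generated under proportional odds.
n <- 500
x <- rnorm(n)
latent <- 1.2 * x + rlogis(n)
y <- cut(latent,
         breaks = c(-Inf, quantile(latent, probs = seq(0.1, 0.9, by = 0.1)), Inf),
         labels = 1:10, ordered_result = TRUE)
dat <- data.frame(x = x, y = y)

# Cumulative logit (proportional odds) model with one common slope.
fit <- polr(y ~ x, data = dat, Hess = TRUE)

# Separate binary logits for P(y > j) at each cutpoint j = 1, ..., 9.
# Under proportional odds the slope for x should be roughly constant in j.
cuts <- 1:9
slopes <- sapply(cuts, function(j) {
  coef(glm(I(as.integer(y) > j) ~ x, family = binomial, data = dat))["x"]
})

# Plot the cutpoint-specific slopes against polr's common slope.
plot(cuts, slopes, type = "b",
     xlab = "Cutpoint j", ylab = "Slope for x in logit P(y > j)")
abline(h = coef(fit)["x"], lty = 2)
```

A clear trend in the slopes across cutpoints would suggest which
predictors depart from proportional odds. Slopes at the extreme
cutpoints rest on few observations in one group and will be noisier.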

I also used a regression tree approach to explore this data set. In
doing so, I treated the response as numeric, using the rpart package.
I am rather new to regression trees and wondered about the validity of
this approach. I used cross-validation to prune the tree, but plots of
the response clearly indicate that the data are non-normal and don't
have equal variance (the data are heavily concentrated in the larger
response categories, values of 8-10). I have seen some people suggest
that the tree approach is essentially non-parametric, but then I have
seen other references suggesting examination of residual plots and
potential transformations of the response to ensure homogeneity of
variance. For this data set, it will be difficult to find an
appropriate transformation, given the large number of responses near
10 (i.e., the fact that the data are constrained to be less than or
equal to 10 results in strange residual plots).
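
For reference, the tree workflow described above can be sketched
roughly as follows. This is only an illustration on simulated data.
The variables x1, x2, and y are hypothetical, and pruning at the
smallest cross-validated error is one common choice, not necessarily
the rule used here. A classification tree on the same response is
included as an alternative that does not treat the categories as
numeric.

```r
library(rpart)

set.seed(2)

# Simulated stand-in data (assumption): scores piled up near the
# upper limit of 10, with two hypothetical predictors.
n <- 400
x1 <- rnorm(n)
x2 <- runif(n)
score <- pmin(10, pmax(1, round(8 + 1.5 * x1 - 2 * (x2 > 0.7) + rnorm(n))))
dat <- data.frame(y = score, x1 = x1, x2 = x2)

# Regression tree: response treated as numeric (sum-of-squares splits).
tree_num <- rpart(y ~ x1 + x2, data = dat, method = "anova")

# Prune at the complexity parameter with the smallest cross-validated
# error, taken from the table that rpart stores in the fitted object.
cp_tab <- tree_num$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]
pruned_num <- prune(tree_num, cp = best_cp)

# Alternative: classification tree, treating the response as a factor.
tree_cls <- rpart(factor(y) ~ x1 + x2, data = dat, method = "class")
```

Calling plotcp(tree_num) displays the cross-validated error against
the complexity parameter, which can help in choosing where to prune.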