Wilcox Test / Mann Whitney U Test - R-help

Thu, Oct 6, 2011 6:35 AM

Hello List,

I'm trying to prepare some lecture notes on nonparametric methods,
and I can't manually reproduce the results of the wilcox.test function
for ordinal data.

The data I'm using are from David Howell's website, available here:

http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

If I run the wilcox.test function on the data I get a p-value of
0.0407, but when I do the calculation myself I get a p-value of 0.0530.
What bothers me is not so much the jump across 0.05 as the fact that I
thought I knew what the function was doing.

I know from the R help page that there is some controversy about how
exactly to calculate the test statistic, but that's not what is
causing the problem, as I can get the same W value.  Am I calculating
the test statistic incorrectly?

Thanks, sample code below.
Sam Stewart

#Ordinal example
dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6))
remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3))
tab2 = rbind(table(dropouts),table(remain))
ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE)
cumsum(colSums(tab2))
n1 = length(dropouts)
n2 = length(remain)
#Midranks of the pooled sample; the first n1 entries belong to dropouts
r = rank(c(dropouts,remain))
#W is the larger of the two rank sums
W = max(sum(r[1:n1]),sum(r[-(1:n1)]))
#Normal approximation with no correction for ties
testStat = (W-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12))
2*(1-pnorm(testStat))

Sam Stewart

Thu, Oct 6, 2011 6:56 AM

So I checked it with wilcox_test from the coin package, and got the
same result as wilcox.test.  That makes me more confident that I made
a mistake, but it still doesn't help me find it.

library(coin)
#wilcox_test needs the grouping variable to be a factor
d = data.frame(value=c(dropouts,remain),group=factor(c(rep("dropout",length(dropouts)),rep("remain",length(remain)))))
wilcox_test(value~group,data=d)

Sam


Sam Stewart

Thu, Oct 6, 2011 7:05 AM

And I figured it out, sorry to bother the list.

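The normal approximation I was using is not accurate in the presence of
ties: the variance n1*n2*(n1+n2+1)/12 assumes there are none, and with
ties it has to be reduced by a tie correction.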
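
For anyone following along, here is a minimal sketch of the tie-corrected
calculation, assuming the usual correction in which the sum of (t^3 - t)
over the tied groups is subtracted inside the variance. The variable names
(R1, ties, varPlain, varTies, pTies, pBuiltin) are only for illustration.
The corrected p-value should agree with what wilcox.test reports when
correct=FALSE and exact=FALSE.

```r
# Data from the original post
dropouts <- c(rep(0,25), rep(3,10), rep(2,9), rep(1,13), rep(4,6))
remain   <- c(rep(0,31), rep(3,2), rep(2,6), rep(1,21), rep(4,3))
n1 <- length(dropouts)
n2 <- length(remain)
N  <- n1 + n2

# Midranks of the pooled sample; the first n1 entries belong to dropouts
r  <- rank(c(dropouts, remain))
R1 <- sum(r[1:n1])   # rank sum of the dropouts group

# Sizes of the tied groups in the pooled sample
ties <- table(c(dropouts, remain))

# Variance of the rank sum without and with the tie correction
varPlain <- n1 * n2 * (N + 1) / 12
varTies  <- n1 * n2 / 12 * ((N + 1) - sum(ties^3 - ties) / (N * (N - 1)))

# Two-sided p-value from the tie-corrected normal approximation
z     <- (R1 - n1 * (N + 1) / 2) / sqrt(varTies)
pTies <- 2 * pnorm(-abs(z))

# Compare with the built-in test (no continuity correction)
pBuiltin <- wilcox.test(dropouts, remain, correct = FALSE, exact = FALSE)$p.value
all.equal(pTies, pBuiltin)
```

With this many ties (only five distinct values across 126 observations)
varTies is noticeably smaller than varPlain, which is why the uncorrected
version gives a larger p-value.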

Sam
