Hello List,

I'm trying to prepare some lecture notes on nonparametric methods, and I can't manually reproduce the results of the wilcox.test function for ordinal data. The data I'm using are from David Howell's website, available here:
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

If I run the wilcox.test function on the data I get a p-value of 0.0407, but when I do it myself I get a p-value of 0.0530. It's not so much the jump across 0.05 that bothers me, but the fact that I thought I knew what the function was doing. I know from the R help page that there is some controversy about how exactly to calculate the test statistic, but that's not what is causing the problem, as I can get the same W value. Am I calculating the test statistic incorrectly?

Thanks, sample code below.

Sam Stewart

#Ordinal example
dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6))
remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3))
tab2 = rbind(table(dropouts),table(remain))
ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE)
cumsum(colSums(tab2))
n1 = length(dropouts)
n2 = length(remain)
# W is the larger of the two rank sums, with midranks taken over the pooled data
ranks = rank(c(dropouts,remain))
W = max(sum(ranks[1:n1]),sum(ranks[-(1:n1)]))
# Normal approximation to the rank-sum statistic
testStat = (W-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12))
2*(1-pnorm(testStat))
Wilcox Test / Mann Whitney U Test
3 messages · Sam Stewart
So I checked it with wilcox_test from the coin package and got the same result. That makes me more confident that I made a mistake, but it still doesn't help me find it.

library(coin)
# wilcox_test needs the grouping variable to be a factor
d = data.frame(value=c(dropouts,remain),group=factor(c(rep("dropout",length(dropouts)),rep("remain",length(remain)))))
wilcox_test(value~group,data=d)

Sam
On Thu, Oct 6, 2011 at 10:35 AM, Sam Stewart <rhelp.stats at gmail.com> wrote:
> [snip]
And I figured it out, sorry to bother the list. The normal approximation I was using is not accurate in the presence of ties.

Sam
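For anyone finding this thread later, here is a rough sketch of what the tie issue looks like in code. It assumes the standard tie-corrected variance for the rank-sum statistic, which as far as I can tell is what wilcox.test applies when exact=FALSE. The variable names (tie_sizes, sd_naive, sd_ties, and so on) are just mine for illustration.

```r
# Data from the original post
dropouts <- c(rep(0, 25), rep(3, 10), rep(2, 9), rep(1, 13), rep(4, 6))
remain   <- c(rep(0, 31), rep(3, 2), rep(2, 6), rep(1, 21), rep(4, 3))
n1 <- length(dropouts)
n2 <- length(remain)
N  <- n1 + n2

# Rank sum of the first group, using midranks for tied values
pooled <- c(dropouts, remain)
W  <- sum(rank(pooled)[seq_len(n1)])
mu <- n1 * (N + 1) / 2

# Normal approximation WITHOUT a tie correction (the hand calculation)
sd_naive <- sqrt(n1 * n2 * (N + 1) / 12)
p_naive  <- 2 * pnorm(-abs((W - mu) / sd_naive))

# Normal approximation WITH the tie correction:
# the variance shrinks by a term built from the size of each tied group
tie_sizes <- as.vector(table(pooled))
sd_ties <- sqrt(n1 * n2 / 12 *
                ((N + 1) - sum(tie_sizes^3 - tie_sizes) / (N * (N - 1))))
p_ties  <- 2 * pnorm(-abs((W - mu) / sd_ties))

# Reference value from wilcox.test, same settings as in the original post
p_ref <- wilcox.test(dropouts, remain, correct = FALSE, exact = FALSE)$p.value

print(c(naive = p_naive, tie_corrected = p_ties, wilcox.test = p_ref))
```

With only five distinct values across 126 observations the tied groups are large, so the correction term is far from negligible here. The uncorrected version should give the 0.0530 from my first message, and the tie-corrected one should agree with wilcox.test.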
On Thu, Oct 6, 2011 at 10:56 AM, Sam Stewart <rhelp.stats at gmail.com> wrote:
> [snip]