
Different results on running Wilcoxon Rank Sum test in R and SPSS

Dear Professor John,
Thank you very much for your reply!
I agree with you that the non-parametric tests I mentioned in my previous email (Mood's median test and the median test) do not make sense in this situation, because the way I ran them treats PFD_n and drug_code as two different groups. As you correctly said, I want to use PFD_n as a vector of scores and drug_code to split it into two groups. This is exactly what the Independent-Samples Median Test does in SPSS. I wish to perform the same test in R and am unable to do so.
Simply put, I am asking how to perform the Independent-Samples Median Test in R just as it is performed in SPSS.
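
One possible approach in base R is a two-group median test built by hand: split PFD_n at the pooled median and cross-tabulate the result against drug_code. The sketch below rebuilds the complete cases from the xtabs() counts quoted further down in this thread. Counting values equal to the median as "not above", and the choice between a continuity-corrected chi-square and Fisher's exact test, are assumptions; whether SPSS follows the same conventions should be checked against its documentation.

```r
# Counts copied from the xtabs() table quoted later in the thread.
pfd_values <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n_code0    <- c(2,  1,  0,  0,  2,  0,  2,  1,  5,  4,  5,  1)
n_code1    <- c(0,  1,  1,  1,  0,  1,  0,  2,  2,  2, 13,  2)

# One row per non-missing observation.
Data <- data.frame(
  PFD_n     = c(rep(pfd_values, times = n_code0),
                rep(pfd_values, times = n_code1)),
  drug_code = factor(rep(c(0, 1), times = c(sum(n_code0), sum(n_code1))))
)

# Pooled (grand) median over both groups.
grand_median <- median(Data$PFD_n)

# Assumption: values equal to the median count as "not above".
above <- Data$PFD_n > grand_median
tab <- table(above_median = above, drug_code = Data$drug_code)
print(tab)

# 2 x 2 chi-square test; chisq.test() applies Yates' correction by default.
print(chisq.test(tab))
# The uncorrected chi-square and Fisher's exact test, for comparison.
print(chisq.test(tab, correct = FALSE))
print(fisher.test(tab))
```

With these counts the pooled median is 27, and 6 of 23 scores in group 0 versus 15 of 25 in group 1 lie above it. Comparing the three p-values with the SPSS output should show which convention SPSS is using.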

Secondly, regarding your question about the test statistic: I have not performed the Wilcoxon rank sum test in SPSS on the PFD_n and drug_code data. I said something to the contrary in my first email, and I apologize for that.
Thank you very much for your time!
Yours sincerely,
Bharat Rawlley

On Wednesday, 20 January, 2021, 04:47:21 am IST, John Fox <jfox at mcmaster.ca> wrote:
Dear Bharat Rawlley,

What you tried to do appears to be nonsense. That is, you're treating 
PFD_n and drug_code as if they were scores for two different groups.

I assume that what you really want to do is to treat PFD_n as a vector 
of scores and drug_code as defining two groups. If that's correct, and 
with your data read into Data, you can try the following:

------snip ------

 > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE)

        Wilcoxon rank sum test with continuity correction

data:  PFD_n by drug_code
W = 197, p-value = 0.05563
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -2.000014e+00  5.037654e-05
sample estimates:
difference in location
             -1.000019

Warning messages:
1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
  cannot compute exact p-value with ties
2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
  cannot compute exact confidence intervals with ties

------snip ------

You can get an approximate confidence interval by specifying exact=FALSE:

------snip ------

 > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE)

        Wilcoxon rank sum test with continuity correction

data:  PFD_n by drug_code
W = 197, p-value = 0.05563
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -2.000014e+00  5.037654e-05
sample estimates:
difference in location
             -1.000019

------snip ------

As it turns out, your data are highly discrete and have a lot of ties 
(see in particular PFD_n = 28):

------snip ------

 > xtabs(~ PFD_n + drug_code, data=Data)

     drug_code
PFD_n  0  1
   0   2  0
   16  1  1
   18  0  1
   19  0  1
   20  2  0
   22  0  1
   24  2  0
   25  1  2
   26  5  2
   27  4  2
   28  5 13
   30  1  2

------snip ------
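
For anyone without the original file, the counts in this table are enough to rebuild the non-missing observations and rerun the test. The sketch below assumes the table covers every complete case; the helper names pfd_values, n_code0 and n_code1 are made up for illustration.

```r
# Counts copied from the xtabs() output above
# (rows: PFD_n values, columns: drug_code 0 and 1).
pfd_values <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n_code0    <- c(2,  1,  0,  0,  2,  0,  2,  1,  5,  4,  5,  1)
n_code1    <- c(0,  1,  1,  1,  0,  1,  0,  2,  2,  2, 13,  2)

# Expand the counts back into one row per observation.
Data <- data.frame(
  PFD_n     = c(rep(pfd_values, times = n_code0),
                rep(pfd_values, times = n_code1)),
  drug_code = factor(rep(c(0, 1), times = c(sum(n_code0), sum(n_code1))))
)

# Same call as above, without the confidence interval.
res <- wilcox.test(PFD_n ~ drug_code, data = Data, exact = FALSE)
print(res)
```

This gives 48 rows (23 with drug_code 0 and 25 with drug_code 1) and reproduces W = 197, so the table and the test output above are consistent with each other.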

I'm no expert in nonparametric inference, but I doubt whether the 
approximate p-value will be very accurate for data like these.
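
One way to probe that doubt is a Monte Carlo permutation test on the same rank-sum statistic, which does not rely on the normal approximation. This is only a sketch: the data are rebuilt from the xtabs() counts above, and the seed, the number of permutations, and the two-sided rule (distance of W from its null mean) are arbitrary choices made for illustration.

```r
# Rebuild the complete cases from the xtabs() counts above.
pfd_values <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n_code0    <- c(2,  1,  0,  0,  2,  0,  2,  1,  5,  4,  5,  1)
n_code1    <- c(0,  1,  1,  1,  0,  1,  0,  2,  2,  2, 13,  2)
Data <- data.frame(
  PFD_n     = c(rep(pfd_values, times = n_code0),
                rep(pfd_values, times = n_code1)),
  drug_code = factor(rep(c(0, 1), times = c(sum(n_code0), sum(n_code1))))
)

set.seed(1)                        # arbitrary seed, for reproducibility only
rank_all <- rank(Data$PFD_n)       # midranks, so ties are handled
is_code0 <- Data$drug_code == "0"
n0 <- sum(is_code0)
n1 <- sum(!is_code0)

# W as reported by wilcox.test(): rank sum of group 0 minus its minimum.
w_stat <- function(in_group0) sum(rank_all[in_group0]) - n0 * (n0 + 1) / 2
w_obs  <- w_stat(is_code0)
w_null <- n0 * n1 / 2              # mean of W under the null hypothesis

# Shuffle the group labels and recompute W each time.
n_perm <- 20000
w_perm <- replicate(n_perm, w_stat(sample(is_code0)))

# Two-sided Monte Carlo p-value (with the usual +1 correction).
p_perm <- (sum(abs(w_perm - w_null) >= abs(w_obs - w_null)) + 1) / (n_perm + 1)
print(c(W = w_obs, p_perm = p_perm))
```

If p_perm lands close to the 0.05563 reported by wilcox.test(), the approximation is doing reasonably well despite the ties; a clear gap would support the doubt.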

I don't know why wilcox.test() (correctly used) and SPSS are giving you 
slightly different results -- assuming that you're actually doing the 
same thing in both cases. I couldn't help but notice that most of your 
data are missing. Are you getting the same value of the test statistic 
and different p-values, or is the test statistic different as well?

I hope this helps,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
On 2021-01-19 5:46 a.m., bharat rawlley via R-help wrote: