The three routines in R that calculate the Wilcoxon signed-rank test give different p-values... which is correct?

3 messages · Michael G Rupert, Peter Ehlers, Peter Dalgaard

#
On 2011-04-12 16:57, Michael G Rupert wrote:
Ahem .... that's a pretty strong claim.

Actually, the problem is user misunderstanding and the relevant
help pages do tell you where the differences lie.

Let's take the 3 functions one at a time, using your
x,y data from Pratt:

1. wilcox.test() in the stats package
This function automatically switches to using a Normal
approximation when there are ties in the data:

  wilcox.test(x, y, paired=TRUE)$p.value
#[1] 0.05802402
(You can suppress the warning (due to ties) by specifying
the argument 'exact=FALSE'.)

This function also uses a continuity correction unless
told not to:

  wilcox.test(x, y, paired=TRUE, correct=FALSE)$p.value
#[1] 0.05061243
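The effect of the continuity correction can be checked by hand. The following is a minimal base-R sketch, not the thread's own code: the vector `d` is a hypothetical set of paired differences chosen only to reproduce the same signed ranks (two zeros, negatives at the three smallest nonzero ranks), and it assumes no ties among the nonzero absolute differences.

```r
# Hypothetical paired differences (x - y); NOT the thread's data, only a
# stand-in that gives the same signed ranks.
d <- c(0, 0, -1, -2, -3, 4, 5, 6, 7, 8, 9)

nz <- d[d != 0]                     # wilcox.test() drops zero differences
n  <- length(nz)                    # 9 nonzero differences
V  <- sum(rank(abs(nz))[nz > 0])    # sum of the positive ranks: 39

mu    <- n * (n + 1) / 4                        # mean of V under H0
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24)   # sd of V when there are no ties

# Two-sided p-values; V > mu here, so the correction moves V 0.5 toward mu
p_corrected   <- 2 * pnorm((V - mu - 0.5) / sigma, lower.tail = FALSE)
p_uncorrected <- 2 * pnorm((V - mu) / sigma, lower.tail = FALSE)
print(c(corrected = p_corrected, uncorrected = p_uncorrected))

# Cross-check against stats::wilcox.test() on the same hypothetical differences
print(wilcox.test(d, exact = FALSE)$p.value)
print(wilcox.test(d, exact = FALSE, correct = FALSE)$p.value)
```

The two hand-computed values should agree with the two p-values quoted in item 1.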

2. wilcox.exact() in pkg exactRankTests
This function can handle ties (using the "Wilcoxon" method)
with an 'exact' calculation:

  wilcox.exact(x, y, paired=TRUE)$p.value
#[1] 0.0546875

If you want the Normal approximation:

  wilcox.exact(x, y, paired=TRUE, exact=FALSE)$p.value
#[1] 0.05061243  <-- cf. above
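The 'exact' value can also be reproduced in base R with `psignrank()`. This is a sketch under two assumptions taken from elsewhere in the thread: 9 nonzero differences remain after the zeros are dropped, and the statistic is V = 39.

```r
n     <- 9                       # assumed: nonzero differences after dropping zeros
V     <- 39                      # positive rank sum reported in the thread
V_neg <- n * (n + 1) / 2 - V     # negative rank sum: 45 - 39 = 6

# psignrank(q, n) is P(V <= q) for the signed-rank statistic under H0;
# the null distribution is symmetric, so the two-sided p-value doubles one tail.
p_exact <- 2 * psignrank(V_neg, n)
print(p_exact)                   # 28/512
```

This only works because, once the zeros are removed, the remaining ranks are the untied integers 1 to 9.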

3. wilcoxsign_test() in pkg coin
This is the most comprehensive of these functions.
It is also the only one that offers the "Pratt" method
of handling ties. It will default to this method and
a Normal approximation:

  pvalue(wilcoxsign_test(x ~ y))
#[1] 0.08143996

  pvalue(wilcoxsign_test(x ~ y, zero.method="Pratt",
         distribution="asympt"))
#[1] 0.08143996

You can get the results from wilcox.exact() with

  pvalue(wilcoxsign_test(x ~ y, zero.method="Wilcoxon",
         distribution="asympt"))
#[1] 0.05061243

and

  pvalue(wilcoxsign_test(x ~ y, zero.method="Wilcoxon",
         distribution="exact"))
#[1] 0.0546875
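To see why the "Pratt" default gives a different answer, the asymptotic p-value can be rebuilt by hand. This is a sketch rather than coin's actual implementation: it assumes the signed Pratt ranks are -3, -4, -5 and 6 to 11 (as printed later in the thread), meaning the two zero differences take ranks 1 and 2 and are then discarded.

```r
# Signed Pratt ranks of the nonzero differences (zeros held ranks 1 and 2)
signed_ranks <- c(-3, -4, -5, 6:11)

T_pos <- sum(signed_ranks[signed_ranks > 0])   # positive rank sum: 51
mu    <- sum(abs(signed_ranks)) / 2            # mean under H0: 31.5
sigma <- sqrt(sum(signed_ranks^2) / 4)         # each rank is positive with prob. 1/2

z <- (T_pos - mu) / sigma                      # no continuity correction
p_pratt_asymptotic <- 2 * pnorm(abs(z), lower.tail = FALSE)
print(p_pratt_asymptotic)
```

The ranks are larger than under the "Wilcoxon" method, but the negative differences now carry relatively more weight (12 of 63 rather than 6 of 45), which is why the p-value rises.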

As to which method you should use, that's up to you.

Peter Ehlers

#
On Apr 13, 2011, at 01:57 , Michael G Rupert wrote:

            
Well, there are two versions of zero-handling, and for each of these, you can have exact p values or asymptotic p values with or without continuity correction, so that's 6 possibilities already.

They do if you turn off the continuity correction in wilcox.test:

        Wilcoxon signed rank test

data:  x and y
V = 39, p-value = 0.05061
alternative hypothesis: true location shift is not equal to 0

So one does continuity correction and the other not.

They still handle zeros differently. wilcox.exact does not handle the Pratt ranking.

To get exact p values for Pratt ranks, try
1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08984
alternative hypothesis: true mu is not equal to 0 


... and for the asymptotic counterpart:
Asymptotic 1-sample Permutation Test

data:  c(-3, -4, -5, 6:11) 
T = 51, p-value = 0.08144
alternative hypothesis: true mu is not equal to 0
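The exact Pratt p-value above can be checked in base R by brute force, since there are only 2^9 = 512 sign assignments. This is a minimal sketch that enumerates them directly; it is not necessarily how the permutation test routine computes its result.

```r
signed_ranks <- c(-3, -4, -5, 6:11)             # signed Pratt ranks from the output above
ranks <- abs(signed_ranks)
T_pos <- sum(signed_ranks[signed_ranks > 0])    # observed statistic: 51

# All 2^9 sign assignments: 1 = rank counted as positive, 0 = negative
signs   <- as.matrix(expand.grid(rep(list(0:1), length(ranks))))
pos_sum <- as.vector(signs %*% ranks)           # positive rank sum for each assignment

# The null distribution is symmetric about sum(ranks) / 2, so double one tail
p_exact <- 2 * mean(pos_sum >= T_pos)
print(p_exact)                                  # 46/512
```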
Not found. Apparently, you _constructed_ a data set to get the same set of ranks.
Apparently, the Pratt paper predates the convention that a p value is the probability of observing "the test statistic or more extreme" and he switches back and forth between "less than" and "less than or equal" (to a negative rank sum of 6 and 12 resp.). Also, his p-values are one-sided.

Using modern technology, it is pretty easy to generate the enumerations that Pratt is referring to:
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
 1  1  1  2  2  3  4  5  6  8  9 10 12 13 15 17 18 19 21 21 22 23 23
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
23 23 22 21 21 19 18 17 15 13 12 10  9  8  6  5  4  3  2  2  1  1  1

 0  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 1  1  1  1  1  2  2  3  3  4  4  5  6  7  7  8 10 10 11 12 13 13 15
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
15 16 16 17 17 18 17 17 18 17 17 16 16 15 15 13 13 12 11 10 10  8  7
48 49 50 51 52 53 54 55 56 57 58 59 60 63
 7  6  5  4  4  3  3  2  2  1  1  1  1  1
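Frequency tables like the two above (rank sums in the upper rows, counts in the lower rows) can be generated in a few lines of base R. This is a sketch of one possible approach, not the code used in the thread; the helper `subset_sum_counts()` is a hypothetical name. It counts, for each possible rank sum, how many of the 2^9 sign assignments produce it, first for ranks 1 to 9 (zeros dropped) and then for the Pratt ranks 3 to 11.

```r
# For each achievable sum, count the subsets of `ranks` that add up to it.
subset_sum_counts <- function(ranks) {
  total  <- sum(ranks)
  counts <- c(1, rep(0, total))          # counts[s + 1] = number of subsets with sum s
  for (r in ranks) {
    # Subsets that include rank r: shift the current counts up by r
    shifted <- c(rep(0, r), head(counts, -r))
    counts  <- counts + shifted
  }
  names(counts) <- 0:total
  counts[counts > 0]                     # drop sums that cannot occur
}

wilcoxon_counts <- subset_sum_counts(1:9)    # zeros dropped, ranks 1..9
pratt_counts    <- subset_sum_counts(3:11)   # Pratt ranking, ranks 3..11
print(wilcoxon_counts)
print(pratt_counts)

# One-sided probabilities of a negative rank sum <= 6 and <= 12 respectively
p_wilcoxon_one_sided <- sum(wilcoxon_counts[as.integer(names(wilcoxon_counts)) <= 6]) / 2^9
p_pratt_one_sided    <- sum(pratt_counts[as.integer(names(pratt_counts)) <= 12]) / 2^9
print(c(wilcoxon = p_wilcoxon_one_sided, pratt = p_pratt_one_sided))
```

The one-sided values come out as 14/512 and 23/512; doubling them gives the exact two-sided p-values 0.0546875 and 0.08984 quoted earlier in the thread.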