
wilcox.test point estimates perverse (PR#1150)

For tied samples these estimators may not lie inside the confidence sets,
which is confusing. Example (from ?wilcox.exact, package exactRankTests):
Exact Wilcoxon rank sum test

data:  contr and treat 
W = 9, point prob = 0.019, p-value = 0.0989 
alternative hypothesis: true mu is not equal to 0 
95 percent confidence interval:
 -22   4 
sample estimates:
difference in location 
                    13 		<- when computed with median(diffs)

For compatibility between the 2 versions of the Wilcoxon-Test, 
we use the basic definition: 

  d_1 = sup {d | W(d) > E(W) }
  d_2 = inf {d | W(d) < E(W) }

  Hodges-Lehmann = mean(d_1, d_2)

(using max and min instead of sup and inf, which causes the difference).
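
One possible reading of the definition above, sketched in R. The helper
name hl_max_min is hypothetical, and the sketch assumes that W(d) is the
Mann-Whitney count for the shifted sample x - d versus y (ties counted as
1/2), that E(W) = n_x * n_y / 2, and that max and min are taken over the
observed pairwise differences only. The actual code in exactRankTests may
differ.

```r
# Hypothetical helper, not part of any package.
hl_max_min <- function(x, y) {
  diffs <- sort(as.vector(outer(x, y, "-")))  # all pairwise differences x_i - y_j
  EW <- length(diffs) / 2                     # assumed E(W) = n_x * n_y / 2
  # W(d): number of pairs with x_i - d > y_j, ties counted as 1/2,
  # evaluated only at the observed differences
  W <- vapply(diffs,
              function(d) sum(diffs > d) + 0.5 * sum(diffs == d),
              numeric(1))
  d1 <- max(diffs[W > EW])  # max in place of sup
  d2 <- min(diffs[W < EW])  # min in place of inf
  c(d1 = d1, d2 = d2, estimate = mean(c(d1, d2)))
}

x <- c(1, 5, 12)
y <- c(0, 2, 3)
est <- hl_max_min(x, y)
med <- median(outer(x, y, "-"))
print(est)  # d1 = 2, d2 = 5, estimate = 3.5
print(med)  # 3
```

Under these assumptions the two estimates differ even for untied data when
the number of pairwise differences is odd, because the median difference
itself satisfies W(d) = E(W) and is excluded from both sets.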
However, this may be questionable ...
wilcox.test uses the normal approximation when 

a) the sample sizes are large or
b) ties occur.

Computing all differences in case a) is not feasible (moving
`outer(x,y,"-")' to C does not speed it up significantly). Therefore, we
use uniroot to search for d with W(X - d, Y) = E(W).
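
A minimal sketch of such a uniroot search, under stated assumptions. The
helper names w_centered and hl_uniroot are hypothetical, the statistic is
left unstandardized, and no continuity correction is applied, so this is
not the code used in `wilcox.test' itself.

```r
# Hypothetical helper: W(X - d, Y) - E(W), with W in Mann-Whitney form.
w_centered <- function(d, x, y) {
  nx <- length(x)
  ny <- length(y)
  r <- rank(c(x - d, y))  # midranks are used if ties occur
  sum(r[seq_len(nx)]) - nx * (nx + 1) / 2 - nx * ny / 2
}

# Hypothetical helper: locate d where w_centered changes sign.
hl_uniroot <- function(x, y, tol = 1e-6) {
  lower <- min(x) - max(y) - 1  # every x_i - d exceeds every y_j: value > 0
  upper <- max(x) - min(y) + 1  # every x_i - d is below every y_j: value < 0
  uniroot(w_centered, lower = lower, upper = upper,
          x = x, y = y, tol = tol)$root
}

x <- c(1, 5, 12)
y <- c(0, 2, 3)
root <- hl_uniroot(x, y)
print(root)  # close to 3, the median of the pairwise differences
```

W(X - d, Y) is a step function of d, so uniroot converges to the jump
where the sign changes. If W equals E(W) on a whole interval, any point of
that interval may be returned. Only the rank computation is needed here;
the matrix of pairwise differences is never formed.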

In case b), `wilcox.test' uses the normal approximation for p-values and
confidence intervals, so it seems natural to me to compute the point
estimator the same way the confidence intervals are computed. Additionally,
median(diffs) may lie outside the confidence set in this situation (see
above).
Because the normal approximation is bizarre for small, tied samples? 
That is what `wilcox.exact' is for. 

Torsten
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._