stratified Wilcoxon available?
On Tue, 30 Aug 2005, Peter Dalgaard wrote:
Torsten Hothorn <Torsten.Hothorn at rzmail.uni-erlangen.de> writes:
On Mon, 29 Aug 2005, Peter Dalgaard wrote:
Torsten Hothorn <Torsten.Hothorn at rzmail.uni-erlangen.de> writes:
Dear All, is there a stratified version of the Wilcoxon test (also known as van Elteren test) available in R?
you can plug it together using the `coin' infrastructure (see the examples in the manual and vignette).
I managed to dig out our old code, and patched it up loosely to fit versions of R later than 0.62 (the trend test code still seems broken). Use at own risk. The usage is fairly straightforward:
SKruskal.test(y~g1|g2)
Kruskal-Wallis stratified rank sum test
data: y , group: g1 , strata: g2
K = 3.1486, df = 3, p-value = 0.3693
the conditional version of the above test can be computed as follows:
R> library("coin")
R> set.seed(290875)
R> mydf <- data.frame(y = rnorm(90), x = gl(3, 30)[sample(1:90)], b = gl(3, 30))
R>
R> ### with global ranks, i.e., ranking without using the blocks
R> kruskal_test(y ~ x | b, data = mydf)
Asymptotic Kruskal-Wallis Test
data: y by groups 1, 2, 3
stratified by b
T = 0.5242, df = 2, p-value = 0.7694
R>
R> ### with _blockwise_ ranking and quadratic form of the test statistic
R> independence_test(y ~ x | b, data = mydf, ytrafo = function(data) trafo(data,
+ numeric_trafo = rank, block = mydf$b), teststat = "quad")
Asymptotic General Independence Test
data: y by groups 1, 2, 3
stratified by b
T = 0.3824, df = 2, p-value = 0.826
R>
R> ### the same
R> SKruskal.test(y ~ x | b, data = mydf)
Kruskal-Wallis stratified rank sum test
data: y , group: x , strata: b
K = 0.3824, df = 2, p-value = 0.826
R>
R>
R>
R> ### trend test (using scores 1, 2, 3)
R> independence_test(y ~ x | b, data = mydf, ytrafo = function(data) trafo(data,
+ numeric_trafo = rank, block = mydf$b), teststat = "quad",
+ scores = list(x = 1:3))
Asymptotic General Independence Test
data: y by groups 1 < 2 < 3
stratified by b
T = 0.0264, df = 1, p-value = 0.871
R>
R> ### hm.
R> SKruskal.test(y ~ x | b, data = mydf, trend = TRUE)
Error in SKruskal.test(y ~ x | b, data = mydf, trend = TRUE) :
subscript out of bounds
Best,
Torsten
Nice to see that the old code made sense.
Nice to see that the new code is working correctly :-)
A bit surprising that it gives _exactly_ the same result as the blockwise ranking in coin...
why? Without ties, the conditional and unconditional versions of the tests should have exactly the same result.
Perhaps introduce some ties? (round(y,1) is usually effective).
this should generate differences, yes.
The trend test is easily fixed: just spell "t value" without capital V as we do nowadays. This gives
SKruskal.test(y ~ x | b, data = mydf, trend = TRUE)
Kruskal-Wallis stratified rank sum trend test
data: y , group: x , strata: b , trend: as.numeric(group)
Z = -0.1624, df = 1, p-value = 0.871
SKruskal.test(y ~ x | b, data = mydf, trend = 1:3)
Kruskal-Wallis stratified rank sum trend test
data: y , group: x , strata: b , trend: 1 2 3
Z = -0.1624, df = 1, p-value = 0.871
(The df=1 is a bit misleading in this case...)
maybe report 0.1624^2 = 0.0264 as test statistic (same as with teststat = "quad" in `coin')? Best, Torsten
-- O__ ---- Peter Dalgaard ??ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907