An earlier post had posed the question: "Does anybody know what is relation between 'T' value calculated by 'wilcox_test' function (coin package) and more common 'W' value?" I found the question interesting and ran the commands in R and SPSS. The W reported by R did not seem to correspond to either Mann-Whitney U, Wilcoxon W or the Z which I have more commonly used. Correction for ties may have affected my results. Can anyone else explain what the reported W is and the relation to the reported T?
regards
bob
Wilcoxon Mann-Whitney Rank Sum Test in R
6 messages · Bob Green, Peter Dalgaard, Torsten Hothorn +1 more
Bob Green <bgreen at dyson.brisnet.org.au> writes:
An earlier post had posed the question: "Does anybody know what is relation between 'T' value calculated by 'wilcox_test' function (coin package) and more common 'W' value?" I found the question interesting and ran the commands in R and SPSS. The W reported by R did not seem to correspond to either Mann-Whitney U, Wilcoxon W or the Z which I have more commonly used. Correction for ties may have affected my results. Can anyone else explain what the reported W is and the relation to the reported T?
Well, it's open source... You could just go check. W is the sum of the ranks in the first group, minus the minimum value it can attain, namely sum(1:n1) == n1*(n1+1)/2. In the tied cases, the actual minimum could be larger. The T would seem to be asymptotically normal:
wilcox_test(pd ~ age, data = water_transfer, distribution = "asymp")
Asymptotic Wilcoxon Mann-Whitney Rank Sum Test
data: pd by groups 12-26 Weeks, At term
T = -1.2247, p-value = 0.2207
alternative hypothesis: true mu is not equal to 0
pnorm(-1.2247)*2
[1] 0.2206883
so a good guess at its definition is that it is obtained from W or one of the others by subtracting the mean and dividing by the SD.
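The relation described above can be checked by hand in base R. The following is only a minimal sketch on made-up data without ties (the vectors x and y are hypothetical, not the water_transfer data): it rebuilds W from the rank sum of the first group and then standardizes it with the usual no-ties mean and SD, which should reproduce the normal-approximation p-value that wilcox.test reports when the continuity correction is switched off.

```r
# Hypothetical toy data with no ties (not taken from the thread)
x <- c(1.1, 2.3, 0.7, 3.5)
y <- c(2.9, 4.1, 3.8, 5.0, 1.9)
n1 <- length(x)
n2 <- length(y)

# Rank sum of the first group in the pooled sample
r <- rank(c(x, y))
rank_sum <- sum(r[seq_len(n1)])

# W = rank sum minus its minimum possible value n1*(n1+1)/2
W_by_hand <- rank_sum - n1 * (n1 + 1) / 2
W_r <- unname(wilcox.test(x, y)$statistic)

# Standardize W: subtract the null mean, divide by the null SD (no ties)
mu <- n1 * n2 / 2
sigma <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z <- (W_by_hand - mu) / sigma
p_by_hand <- 2 * pnorm(-abs(z))

# Normal approximation from wilcox.test, without continuity correction
p_r <- wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value

print(c(W_by_hand = W_by_hand, W_r = W_r))
print(c(z = z, p_by_hand = p_by_hand, p_r = p_r))
```

Here z plays the role that T appears to play in the wilcox_test output: a standardized statistic whose two-sided normal p-value is 2 * pnorm(-abs(z)).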
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
Peter Dalgaard wrote:
so a good guess at its definition is that it is obtained from W or one of the others by subtracting the mean and dividing with the SD.
With the SD adjusted for ties, of course. (See, e.g., Conover's book.)
Peter Ehlers
University of Calgary
P Ehlers <ehlers at math.ucalgary.ca> writes:
so a good guess at its definition is that it is obtained from W or one of the others by subtracting the mean and dividing with the SD.
With the SD adjusted for ties, of course. (See, e.g., Conover's book.)
...which is actually the exact SD, conditional on the set of tied ranks, not just a correction term. See my discussion with Torsten a month or so ago.
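The point about the exact conditional SD can be illustrated in base R. This is a sketch under stated assumptions: the data are made up and deliberately contain ties, and the variance is written as the permutation variance of a sum of n1 midranks drawn without replacement from the pooled midranks. The resulting p-value should agree with the "tie-corrected" normal approximation in wilcox.test, which suggests the correction is just this conditional variance written differently.

```r
# Hypothetical toy data with ties (not taken from the thread)
x <- c(1, 2, 2, 3, 5)
y <- c(2, 3, 4, 5, 5, 6)
n1 <- length(x)
n2 <- length(y)
N <- n1 + n2

# Midranks of the pooled sample; ties share the average rank
a <- rank(c(x, y))

# Linear statistic: sum of the midranks in the first group
T_lin <- sum(a[seq_len(n1)])

# Conditional (permutation) mean and variance given the observed midranks
mu <- n1 * mean(a)
sigma2 <- n1 * n2 / (N * (N - 1)) * sum((a - mean(a))^2)

# Standardized statistic, no continuity correction
z <- (T_lin - mu) / sqrt(sigma2)
p_cond <- 2 * pnorm(-abs(z))

# Tie-corrected normal approximation as computed by wilcox.test
p_r <- wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value

print(c(z = z, p_cond = p_cond, p_r = p_r))
```

No separate tie-correction term is needed in this form: the ties enter only through the midranks in a.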
On Wed, 21 Dec 2005, Peter Dalgaard wrote:
P Ehlers <ehlers at math.ucalgary.ca> writes:
so a good guess at its definition is that it is obtained from W or one of the others by subtracting the mean and dividing with the SD.
With the SD adjusted for ties, of course. (See, e.g., Conover's book.)
...which is actually the exact SD, conditional on the set of tied ranks, not just a correction term. See my discussion with Torsten a month or so ago.
yes, exactly. Thanks, Peter!
The `T' values reported by functions in the `coin' package are _standardized_ statistics. Standardization is done utilizing the conditional expectation and conditional variance of the underlying linear statistics as given by Strasser & Weber (1999). Note that _no_ `continuity correction' whatsoever is applied. The limit distribution is normal (or chisq, when the test statistic is a quadratic form).
The vignette explains in more detail the theoretical framework that `coin' maps into software. It is _definitely_ worth the effort to have a look at it. At first glance it might seem a little bit abstract, but after this you'll see how general and powerful the tools are. We are currently working on a manuscript showing more applications, so watch out for the new `coin' version in a few days.
Best,
Torsten
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Peter, You're right, of course, as usual. Sorry about that. Peter E.