Bootstrap or Wilcoxons' test?

Charlotta,

I'm not sure what you mean when you say simple linear
regression. From your description you have two groups
of people, for which you recorded contaminant concentration.
Thus, I would think you would do something like a t-test to
compare the mean concentration level. Where does the
regression part come in? What are you regressing?
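
One possible reading of the question is that the "regression" is a regression of concentration on a two-level group indicator. Under that assumption, the regression t statistic and the pooled-variance t-test are the same thing. The sketch below uses simulated stand-in data (the names conc and group and all values are hypothetical, not the poster's measurements):

```r
# Simulated stand-in data: hypothetical values, not the real measurements
set.seed(1)
conc  <- c(rlnorm(36, meanlog = 0, sdlog = 1), rlnorm(37, meanlog = 0.3, sdlog = 1))
group <- factor(rep(c("A", "B"), times = c(36, 37)))

# Regression of log concentration on a two-level factor
fit  <- lm(log(conc) ~ group)
t_lm <- coef(summary(fit))["groupB", "t value"]   # contrast is B minus A

# Classical two-sample t-test with pooled variance (contrast is A minus B)
t_tt <- unname(t.test(log(conc) ~ group, var.equal = TRUE)$statistic)

# Same magnitude, opposite sign because the contrasts point in opposite directions
c(lm = t_lm, t.test = t_tt)
```

If the regression had other covariates in it, this equivalence would no longer hold.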

As for the Wilcoxon test, it is often thought of as a
nonparametric equivalent of the t-test. This is only true if the
observations were drawn from a population with the
same probability distribution. The null hypothesis of
the Wilcoxon test is actually "the observations were
drawn from the same probability distribution".
Thus, if your two samples had, say, different variances,
their means could be the same, but since the variances
are different, the Wilcoxon test could give you a significant result.

Don't know if this all makes sense, but if you have more
questions, please e-mail your data and a more detailed
description of what analysis you used and I'd be happy
to try and help out.

Murray M Cooper, Ph.D.
Richland Statistics
9800 N 24th St
Richland, MI, USA 49083
Mail: richstat at earthlink.net

----- Original Message ----- 
From: "Charlotta Rylander" <zcr at nilu.no>
To: <r-help at r-project.org>
Sent: Friday, February 13, 2009 3:24 AM
Subject: [R] Bootstrap or Wilcoxons' test?
Hi!

I'm comparing the differences in contaminant concentration between 2
different groups of people (N=36, N=37). When using a simple linear
regression model I found no differences between groups, but when evaluating
the diagnostic plots of the residuals I found my independent variable to
have deviations from normality (even after log transformation). Therefore I
have used bootstrap on the regression parameters (R=1000 & R=10000) and
this confirms my results, i.e., no differences between groups (and the
distribution is log-normal). However, when using Wilcoxon's rank sum test on
the same data set I find differences between groups.

Should I trust the results from bootstrapping or from Wilcoxon's test?

Thanks!

Regards

Lotta Rylander

[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

I must disagree with both this general characterization of the
Wilcoxon test and with the specific example offered. First, we ought
to spell the author's name correctly and then clarify that it is the
Wilcoxon rank-sum test that is being considered. Next, the WRS test is
a test for differences in the location parameter of independent
samples conditional on the samples having been drawn from the same
distribution. The WRS test would have no discriminatory power for
samples drawn from the same family of distributions with equal location
parameters that differ only in dispersion. Look
at the formula, for Pete's sake. It summarizes differences in ranking,
so it is in fact designed NOT to be sensitive to the spread of the
values in the sample. It would have no power, for instance, to test
the variances of two samples, both with a mean of 0, and one having a
variance of 1 with the other having a variance of 3.  One can think of
the WRS as a test for unequal medians.
David Winsemius, MD. MPH
Heritage Laboratories

On Feb 13, 2009, at 7:48 PM, Murray Cooper wrote:

> [...] Thus, if your two samples had, say, different variances,
> their means could be the same, but since the variances
> are different, the Wilcoxon test could give you a significant result.

Hi Charlotta, to be more constructive toward your goal: if you bootstrap the
regression when the regression is ill-specified, the bootstrap may not help
you. Further, a test as "difficult" as a regression does not seem to be
necessary in your case. A t-test, if your dependent variable is
(approximately) normal in both groups and the variances are equal, or a
Wilcoxon test, if your dependent variable is not normal, should do.

The bootstrap should be very powerful if you do NOT perform it on the
regression (again, bootstrapping the regression may just mean doing the
wrong thing over and over again, which is no improvement). Just bootstrap the
sample means of the two groups and compare them appropriately (see:
http://www.stat.berkeley.edu/users/rodwong/Stat131a/boot_diff_twomeans.pdf
). Otherwise, rely on the result of the Wilcoxon test, as it is likely more
appropriate if your dependent variable is not normal in the two groups.
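
A minimal sketch of bootstrapping the difference in group means, using base R only. The data are simulated stand-ins (g1 and g2 are hypothetical names and values), and the percentile interval is just one of several ways to summarize the bootstrap distribution:

```r
set.seed(123)
# Hypothetical stand-in data: two groups of sizes 36 and 37
g1 <- rlnorm(36, meanlog = 0.0, sdlog = 1)
g2 <- rlnorm(37, meanlog = 0.3, sdlog = 1)

R <- 10000  # number of bootstrap replicates
boot_diff <- replicate(R, {
  # resample each group separately, with replacement, and recompute the difference
  mean(sample(g1, replace = TRUE)) - mean(sample(g2, replace = TRUE))
})

obs_diff <- mean(g1) - mean(g2)              # observed difference in means
ci <- quantile(boot_diff, c(0.025, 0.975))   # percentile 95% interval
obs_diff
ci
```

An interval that excludes 0 would suggest a difference in means. Note that this targets the means, whereas the rank sum test looks at the ordering of the observations, so the two need not agree.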

Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of David Winsemius
Sent: Friday, February 13, 2009 9:19 PM
To: Murray Cooper
Cc: r-help at r-project.org
Subject: Re: [R] Bootstrap or Wilcoxons' test?

> [...] One can think of the WRS as a test for unequal medians.

First of all, sorry for my typing mistakes.

Second, the WRS test is most certainly not a test for unequal medians,
although under specified models it would be, just as under specified
models it can be a test for other measures of location. Perhaps I did not
word my explanation correctly, but I did not mean to imply that it would
be a test of equality of variance. It is plain and simple a test for the
equality of distributions. When the results of a properly applied parametric
test do not agree with the WRS, it is usually due to a difference in the
empirical density functions of the two samples.

Murray M Cooper, Ph.D.
Richland Statistics
9800 N 24th St
Richland, MI, USA 49083
Mail: richstat at earthlink.net


The Wilcoxon rank sum test is not "plain and simple a test for the equality of
distributions". If it were, it would be able to test for
differences in variance when locations were similar. For that purpose
it would, in point of fact, be useless. Compare these simple
situations w.r.t. the WRS:

 > x <- rnorm(100)  # mean=0, sd=1
 > y <- rnorm(100, mean=0, sd=4)
 > wilcox.test(x,y)

	Wilcoxon rank sum test with continuity correction

data:  x and y
W = 4518, p-value = 0.2394
alternative hypothesis: true location shift is not equal to 0

 > y <- rnorm(100, mean=.2, sd=0)  # note: sd=0 makes every y exactly 0.2
 >
 > wilcox.test(x,y)

	Wilcoxon rank sum test with continuity correction

data:  x and y
W = 3900, p-value = 0.004079
alternative hypothesis: true location shift is not equal to 0

It is a test of the equality of location (and the median is a readily
understood non-parametric measure of location). The test is derived
under the *assumption* that the samples are drawn from the *same*
distribution differing only by a shift. If the distributions were not
of the same family, the test would be invalidated. The wilcox.test
help page is informative, saying "the null hypothesis is that the
distributions of x and y differ by a location shift of mu". The
pseudomedian is optionally estimated when conf.int is set to TRUE. I
also suggest looking at the formula for the statistic. It is available
with getAnywhere(wilcox.test.default).
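
For readers who do not want to dig through the source, here is a small sketch of the statistic computed by hand from the joint ranks and compared with what wilcox.test reports. The data are simulated for illustration, and the names W_hand and W_r are arbitrary:

```r
set.seed(7)
x <- rnorm(30)
y <- rnorm(25, mean = 0.5)

r   <- rank(c(x, y))   # joint ranks of the pooled sample
n_x <- length(x)

# Rank sum of the x sample minus its smallest possible value
W_hand <- sum(r[seq_len(n_x)]) - n_x * (n_x + 1) / 2

W_r <- unname(wilcox.test(x, y)$statistic)
c(by_hand = W_hand, wilcox.test = W_r)

# Equivalent view when there are no ties: W counts the pairs with x_i > y_j
sum(outer(x, y, ">"))
```

Only the ordering of the pooled observations enters the statistic, so W divided by the number of pairs estimates the probability that an x exceeds a y.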

If one wants a test for "equality of distribution", one could turn to  
a more general test (with loss of power but with at least some  
potential for detecting differences in dispersion) such as the  
Kolmogorov-Smirnov or Kuiper tests. With x and y as above:

 > ks.test(x,y)

	Two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.61, p-value < 2.2e-16
alternative hypothesis: two-sided

Warning message:
In ks.test(x, y) : cannot compute correct p-values with ties

Returning to the OP's question, rather than worrying about normality  
in samples, the greater threat to validity in regression methods is  
unequal variances across groups or the range of continuous predictors.
David Winsemius

On Feb 13, 2009, at 11:12 PM, Murray Cooper wrote:

> [...] It is plain and simple a test for the equality of distributions.

On Fri, 13 Feb 2009, David Winsemius wrote:

> [...] One can think of the WRS as a test for unequal medians.

One can, and it may be helpful to do so, as long as one knows it isn't actually true. Unfortunately, some textbooks claim or strongly imply it is true.

To make the test consistent for differences in the median you have to know in advance that the distributions differ only by a location shift, and then it is also consistent for differences in mean (or in any other location parameter).

Also, the operating characteristics aren't particularly similar to a real test for medians, which has pretty low efficiency at the Normal location-shift model (2/pi, IIRC) and is much more sensitive to ties in the data.

And I could go on and on about non-transitivity, but I won't. Anyone who is interested can Google for 'Efron dice'.
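
For anyone who would rather not search, a short sketch of the non-transitivity point using the commonly cited set of Efron's four dice (the helper name p_beats is made up for this example). Each die beats the next one in the cycle with probability 2/3:

```r
# Efron's four dice, in the form usually quoted
die_A <- c(4, 4, 4, 4, 0, 0)
die_B <- c(3, 3, 3, 3, 3, 3)
die_C <- c(6, 6, 2, 2, 2, 2)
die_D <- c(5, 5, 5, 1, 1, 1)

# Probability that one roll of die `a` is larger than one roll of die `b`
p_beats <- function(a, b) mean(outer(a, b, ">"))

probs <- c(A_beats_B = p_beats(die_A, die_B),
           B_beats_C = p_beats(die_B, die_C),
           C_beats_D = p_beats(die_C, die_D),
           D_beats_A = p_beats(die_D, die_A))
probs   # each entry is 2/3
```

Since the rank sum statistic is driven by the probability that an observation from one group exceeds one from the other, "A tends to be larger than B" in this sense does not define an ordering of distributions.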

        -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Thomas Lumley wrote:

> One can, and it may be helpful to do so, as long as one knows it
> isn't actually true. Unfortunately, some textbooks claim or
> strongly imply it is true.

Yes. I have been corrected on that point before, which was why I chose
the words I did. Doing a Google search on "derivation wilcoxon rank-sum
test", the first hit is a text, "Introductory Biostatistics" by
Le, that is an example of such a text, and there are many others further
down the hit list.

> To make the test consistent for differences in the median you have
> to know in advance that the distributions differ only by a location
> shift, and then it is also consistent for differences in mean (or in
> any other location parameter).

That is a typical assumption in the derivation of sampling
distributions of the WRS W-statistic, is it not?

Troendle's article in Statistics in Medicine 18, 2763-2773 (1999)
(only available to subscribers and libraries):
http://www3.interscience.wiley.com.online.uchc.edu/journal/66002289/abstract

An interesting on-line accessible discussion by O'Brien and Castelloe:
http://www.amstat.org/sections/SRMS/Proceedings/y2005/Files/JSM2005-000930.pdf

Googling also brought up a Univ. of Minn. website that has R scripts
illustrating permutation tests (including WRS) from Hollander and
Wolfe and a page for the WRS:

http://www.stat.umn.edu/geyer/old/5601/examp/perm.html

http://www.stat.umn.edu/geyer/5601/examp/ranksum.html#test

> Also, the operating characteristics aren't particularly similar to a
> real test for medians, which has pretty low efficiency at the Normal
> location-shift model (2/pi, IIRC) and is much more sensitive to ties
> in the data.

My memory from Conover and Iman (only having seen the first edition)
was that the Pitman efficiency of the WRS in the Gaussian case of
unequal means was around 85% relative to the t-test. I suppose the
choice of a central measure for reporting ought to be based on the
purposes of investigation. If one is planning classification, and the
distributions were skewed, then the median might be preferable because
it is less subject to sampling effects:

 > var( apply( sapply(1:500, function(x) rlnorm(20)), 2, median))
[1] 0.08123678
 >
 >
 > var( apply( sapply(1:500, function(x) rlnorm(20)), 2, mean))
[1] 0.2168887
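
On the efficiency figures: as far as I know, the usual textbook values for the asymptotic (Pitman) relative efficiency at the Normal shift model are 3/pi for the WRS and 2/pi for a median test, both relative to the t-test, while a value near 0.86 (108/125) is the lower bound for the WRS over all shift models. The sketch below prints the two constants and adds a rough finite-sample power comparison; n, shift and nsim are arbitrary illustrative choices:

```r
# Textbook asymptotic relative efficiencies at the Normal location-shift model
are_wrs_vs_t  <- 3 / pi   # Wilcoxon rank sum relative to the t-test
are_sign_vs_t <- 2 / pi   # median (sign-type) test relative to the t-test
round(c(WRS = are_wrs_vs_t, median_test = are_sign_vs_t), 3)

# Rough Monte Carlo check of power under a Normal shift (illustrative settings)
set.seed(2009)
n <- 30; shift <- 0.6; nsim <- 2000
pvals <- replicate(nsim, {
  x <- rnorm(n)
  y <- rnorm(n, mean = shift)
  c(t = t.test(x, y, var.equal = TRUE)$p.value,
    wrs = wilcox.test(x, y)$p.value)
})
power <- rowMeans(pvals < 0.05)   # estimated power of each test at the 5% level
power
```

The simulated powers are only a finite-sample illustration and need not match the asymptotic ratio closely.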

Thank you for the clarification.
David Winsemius