Hi, I am very confused about how to set up wilcox.test in R.
I have two populations, 'original' and 'test'.
I want to know if 'test' is generally lower than 'original'.
I use an alpha of 0.05.
So do I write the function as wilcox.test(original, test, alternative="l")?
or wilcox.test(original, test, alternative="g")?
or wilcox.test(test, original, alternative="g")?
or wilcox.test(test, original, alternative="l")?
How do I interpret the p-value given my criteria?
Do I reject the null when the p-value is less than 0.05,
or when it is greater than 0.95?
Not a statistics major here so I'm really confused.
Need some help.
Thanks.
On Sun, 1 Nov 2009 00:47:50 -0700 (PDT) jomni <jomni1 at gmail.com> wrote:
J> So do I write the function as wilcox.test(original, test,
J> alternative="l")? or wilcox.test(original, test, alternative = "g")?
J> or wilcox.test(test, original, alternative="g")?
J> or wilcox.test(test, original, alternative="l")?
J> How do I interpret the p-value given my criteria?
J> Do I reject null when p-value less than 0.05?
J> or greater than 0.95?
The interpretation of the p depends on how you have tested the
hypothesis.
J> Not a statistics major here so I'm really confused.
You don't need to be one, but please read the documentation and try the
examples it gives.
If you had typed example(wilcox.test) you would have seen, for
example:
wlcx.t> ## Two-sample test.
wlcx.t> ## Hollander & Wolfe (1973), 69f.
wlcx.t> ## Permeability constants of the human chorioamnion (a placental
wlcx.t> ## membrane) at term (x) and between 12 to 26 weeks gestational
wlcx.t> ## age (y). The alternative of interest is greater permeability
wlcx.t> ## of the human chorioamnion for the term pregnancy.
wlcx.t> x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
wlcx.t> y <- c(1.15, 0.88, 0.90, 0.74, 1.21)
wlcx.t> wilcox.test(x, y, alternative = "g") # greater

        Wilcoxon rank sum test

data:  x and y
W = 35, p-value = 0.1272
alternative hypothesis: true location shift is greater than 0
This, I think, makes it very easy to interpret. Here it tests, as the
text says, whether x is greater than y. So if you want to test the
hypothesis that x is smaller than y, you do
wilcox.test(x,y,alternative="less")
then the lower your p is, the higher is the probability that the samples
are different. Hence p < 0.05 would match your significance level. Now the
surprising news:
wilcox.test(y,x,alternative="greater")
would work as well!
If you are in doubt, create an x and a y where you are sure that x is
smaller than y.
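A quick sketch of that sanity check, using made-up data: the two samples below are simulated (the seed, sample sizes and means are arbitrary assumptions, not anything from the original question), with x deliberately drawn from a distribution shifted below y. If the direction is set up correctly, "less" should give a small p-value and "greater" a large one.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

# Hypothetical data: x comes from a distribution shifted well below y.
x <- rnorm(30, mean = 0)
y <- rnorm(30, mean = 2)

# alternative = "less" asks whether x is shifted to the left of y.
p_less <- wilcox.test(x, y, alternative = "less")$p.value

# alternative = "greater" asks the opposite question about the same data.
p_greater <- wilcox.test(x, y, alternative = "greater")$p.value

print(c(less = p_less, greater = p_greater))
```

A p-value near 1 from the "greater" call only says the data give no support for that alternative. It is not a rejection rule; you still reject when the p-value for the alternative you actually care about falls below alpha.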
One final remark: if you have ties (identical values among the
observations) you should use wilcox_test from the coin package.
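To see why ties matter, here is a small sketch with invented data (the vectors a and b are hypothetical). In the R versions I know of, wilcox.test warns that it cannot compute an exact p-value with ties and falls back to a normal approximation; newer versions may behave differently, so the code only captures the warning if one occurs.

```r
# Hypothetical small samples containing tied values (3, 4 and 5 repeat).
a <- c(1, 2, 3, 3, 4)
b <- c(3, 4, 5, 5, 6)

# Capture the warning text, if any, instead of letting it print.
msg <- tryCatch(
  {
    wilcox.test(a, b, alternative = "less")
    NA_character_  # returned when no warning was raised
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

# Requesting the normal approximation explicitly avoids the warning.
res <- wilcox.test(a, b, alternative = "less", exact = FALSE)
print(res$p.value)
```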
hth
Stefan
Comment 1:
As you point out, one should at least scan the documentation.
Here's a quote from ?wilcox.test:
'the one-sided alternative "greater" is that x is shifted
to the right of y'
That's pretty unambiguous.
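Applied to the four calls in the original question, that sentence from the help page can be checked directly. The sketch below uses invented numbers for 'original' and 'test' (hypothetical stand-ins, chosen so that 'test' is mostly lower and there are no ties) and runs all four variants.

```r
# Hypothetical stand-ins for the poster's two samples.
original <- c(12.1, 14.3, 13.8, 15.2, 16.0, 14.9, 13.5)
test     <- c(11.0, 12.6, 10.9, 13.1, 12.2, 11.7)

# The four calls from the question, with the alternative spelled out.
p <- c(
  orig_test_less    = wilcox.test(original, test, alternative = "less")$p.value,
  orig_test_greater = wilcox.test(original, test, alternative = "greater")$p.value,
  test_orig_greater = wilcox.test(test, original, alternative = "greater")$p.value,
  test_orig_less    = wilcox.test(test, original, alternative = "less")$p.value
)
print(p)
```

The second and fourth calls state the same hypothesis ('test' is shifted below 'original') and should give the same p-value; the first and third state the opposite one. So for "is test lower than original?", either wilcox.test(test, original, alternative = "less") or wilcox.test(original, test, alternative = "greater") should do, and you reject when that p-value is below 0.05.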
Comment 2:
I know that you know better, but with p-values it's always
best to be careful with the language. "... the probability
that the _samples_ are different" makes little sense. The
samples _are_ different, period (or why do the test?). The
p-value says something about the distribution from which
the samples are obtained.
Cheers,
Peter Ehlers
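The point about p-values describing the distributions rather than the samples can be illustrated with a small simulation. This is only a sketch under assumed settings (normal data, 10 observations per group, 2000 repetitions, an arbitrary seed): both samples always come from the same distribution, so the null hypothesis is true, yet every pair of samples differs.

```r
set.seed(42)   # arbitrary seed for reproducibility
n_sim <- 2000  # number of simulated experiments (an assumption)

# Both samples are drawn from the same distribution in every repetition.
p_null <- replicate(n_sim, {
  x <- rnorm(10)
  y <- rnorm(10)
  wilcox.test(x, y, alternative = "less")$p.value
})

# Fraction of experiments in which the null would be rejected at alpha = 0.05.
rejection_rate <- mean(p_null < 0.05)
print(rejection_rate)
```

The rejection rate should come out at roughly 0.05 or a little below: with alpha = 0.05 you wrongly reject a true null in about 5% of experiments, even though the two samples are never identical.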
Thanks for all the help.
I also checked Dalgaard's R book and the explanation of the wilcox.test is
very clear compared to the example in the R documentation. Thanks.