Sorry to all who were annoyed by the form of my previous mail; I
didn't realise what would happen :((.
Here it is in (hopefully) plain text (if my mailer doesn't spoil it again):
##############
Dear developers,
I have a problem with a discrepancy between R 1.2.1 for
Windows and R 1.2.2 (and earlier) for Linux. While trying to correct
wilcox.test (see my previous bug report) I encountered the
following problem: I wanted to check the behaviour of the function
wdiff used in the search for the CI limits.
x <- rnorm(10, 3, 1)
y <- rnorm(10, 0, 1)
say
x <- c(3.770684, 4.654342, 3.496403, 1.772743, 1.624953,
       2.645835, 3.099477, 1.706758, 3.507709, 1.982924)
y <- c(-0.8161288, 0.1632923, 0.6421997, 1.9270846, -0.4668112,
       0.3587806, 0.3312529, -0.5393900, 0.1057892, 1.7963575)
mu <- 0
n.x <- length(x)
n.y <- length(y)
wdiff <- function(d, zq) {
    dr <- rank(c(x - mu - d, y))
    NTIES.CI <- table(dr)
    dz <- sum(dr[seq(along = x)]) - n.x * (n.x + 1)/2 - n.x * n.y/2
    CORRECTION.CI <- 0
    SIGMA.CI <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
        sum(NTIES.CI^3 - NTIES.CI)/((n.x + n.y) * (n.x + n.y - 1))))
    dz <- (dz - CORRECTION.CI)/SIGMA.CI
    abs(dz - zq)
}
To examine the behaviour I plotted the course of wdiff over a range of
d for three values of zq (qnorm(0.05), qnorm(0.5), qnorm(0.95)) and
let optimize() compute the minima, which I then added to the plot:
mumin <- min(x) - max(y)
mumax <- max(x) - min(y)
lll <- seq(mumin, mumax, by = 0.01)
wdl <- apply(cbind(lll), 1, wdiff, zq = qnorm(0.05))
wdm <- apply(cbind(lll), 1, wdiff, zq = qnorm(0.50))
wdu <- apply(cbind(lll), 1, wdiff, zq = qnorm(0.95))
plot(lll, wdl, type = "l")
lines(lll, wdm, lty = 4)
lines(lll, wdu, lty = 7)
ol <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.05))$minimum
om <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.5))$minimum
ou <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.95))$minimum
abline(v = ol)
abline(v = om, lty = 4)
abline(v = ou, lty = 7)
When I ran it on my home Linux box (R 1.2.2 (2001-02-26), compiled
from the source tarball with gcc version 2.95.2 19991024), it
computed the minimum correctly only for zq = qnorm(0.05); for the
others it found something totally wrong: the plot showed three curves
with "valleys", one line indicating the right minimum, and one line
indicating a point where the computed minimum surely doesn't lie
(moreover, that line was the same for both 0.5 and 0.95). Then I
checked it at work on S-PLUS 2000 and R 1.2.1 for Windows, and both
gave the correct results and picture (three curves with "valleys" and
three lines indicating where the minima lie). Could you please advise
me whether the problem lies in the platform, or in my compiled
version, and what I should do to make it work properly?
Thank you very much for your help.
Marketa Kylouskova
marketa@ucw.cz
kylouskova@euromise.cz
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
PR#896
4 messages · Marketa.Kylouskova@euromise.cz, Peter Dalgaard, Torsten Hothorn
4 days later
`wdiff' is used for the computation of asymptotic confidence intervals for samples with m, n > 50. I simulated the type I error rate, and it works for such large sample sizes. So maybe the problems are due to n.x = n.y = 10?
Torsten
7 days later
Torsten Hothorn <Torsten.Hothorn@rzmail.uni-erlangen.de> writes:
`wdiff' is used for the computation of asymptotic confidence intervals for samples with m,n > 50. I simulated the type I error rate and it works for such large sample sizes. So maybe the problems are due to n.x = n.y = 10 ? Torsten
I think the real problem is here:
ol <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.05))$minimum
om <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.5))$minimum
ou <- optimize(wdiff, c(mumin, mumax), zq = qnorm(0.95))$minimum
abline(v = ol)
abline(v = om, lty = 4)
abline(v = ou, lty = 7)
optimize() assumes a smooth function and wdiff is not smooth, so the algorithm terminates at a stationary, non-optimal point. Using optim() with the default Nelder-Mead instead seemed to work better, although not perfectly. (I played with this before going on Easter holiday, and I'm not going to reconstruct exactly what I did then just now....)
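[Editor's note: a minimal, self-contained sketch of that suggestion, using the data and wdiff from the original report; the starting value (the midpoint of the search interval) is an assumption, not Peter's actual code.]

    x <- c(3.770684, 4.654342, 3.496403, 1.772743, 1.624953,
           2.645835, 3.099477, 1.706758, 3.507709, 1.982924)
    y <- c(-0.8161288, 0.1632923, 0.6421997, 1.9270846, -0.4668112,
           0.3587806, 0.3312529, -0.5393900, 0.1057892, 1.7963575)
    n.x <- length(x); n.y <- length(y); mu <- 0
    ## Objective from the original report (CORRECTION.CI = 0 folded in)
    wdiff <- function(d, zq) {
        dr <- rank(c(x - mu - d, y))
        NTIES.CI <- table(dr)
        dz <- sum(dr[seq(along = x)]) - n.x * (n.x + 1)/2 - n.x * n.y/2
        SIGMA.CI <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
            sum(NTIES.CI^3 - NTIES.CI)/((n.x + n.y) * (n.x + n.y - 1))))
        abs(dz/SIGMA.CI - zq)
    }
    mumin <- min(x) - max(y)
    mumax <- max(x) - min(y)
    ## Nelder-Mead uses only function values, no derivatives, so the
    ## flat steps of wdiff hurt it less than optimize()'s search
    ## (optim() warns that 1-D Nelder-Mead is unreliable, but runs).
    om <- optim((mumin + mumax)/2, wdiff, zq = qnorm(0.5),
                method = "Nelder-Mead")$par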
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907
1 day later
On 16 Apr 2001, Peter Dalgaard BSA wrote:
optimize() assumes a smooth function and wdiff is not smooth, so the algorithm terminates at a stationary, non-optimal point. Using optim() with the default Nelder-Mead instead seemed to work better, although not perfectly. (I played with this before going on Easter holiday, and I'm not going to reconstruct exactly what I did then just now....)
Yes. We replaced `optimize' with `uniroot' in `wilcox.test' and now use `optim' in `ansari.test'.
Torsten
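[Editor's note: a self-contained sketch of the root-finding idea, not the actual wilcox.test code; `wdiff.root' is a made-up name. Dropping the abs() from wdiff makes the standardised statistic minus zq change sign at the confidence limit, so uniroot() can bracket the crossing instead of minimising a non-smooth objective.]

    x <- c(3.770684, 4.654342, 3.496403, 1.772743, 1.624953,
           2.645835, 3.099477, 1.706758, 3.507709, 1.982924)
    y <- c(-0.8161288, 0.1632923, 0.6421997, 1.9270846, -0.4668112,
           0.3587806, 0.3312529, -0.5393900, 0.1057892, 1.7963575)
    n.x <- length(x); n.y <- length(y); mu <- 0
    ## Signed version of the statistic: positive at the lower end of
    ## the search interval, negative at the upper end, so uniroot()
    ## has a guaranteed sign change to work with.
    wdiff.root <- function(d, zq) {
        dr <- rank(c(x - mu - d, y))
        NTIES <- table(dr)
        dz <- sum(dr[seq(along = x)]) - n.x * (n.x + 1)/2 - n.x * n.y/2
        SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
            sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y - 1))))
        dz/SIGMA - zq
    }
    mumin <- min(x) - max(y)
    mumax <- max(x) - min(y)
    ol <- uniroot(wdiff.root, c(mumin, mumax), zq = qnorm(0.05))$root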