R-1.8.0 seems to calculate wrong covariances when the argument of cov()
is a matrix or a data frame.
The following should produce a matrix of zeroes and NaNs:
x <- matrix(c(NA ,NA ,0.9068995 ,NA ,-0.3116229,
-0.06011117 ,0.7310134 ,NA ,1.738362 ,0.6276125,
0.6615581 ,NA ,NA ,-2.646011 ,-2.126105,
NA ,1.081825 ,NA ,1.253795 ,1.520708,
0.2822814 ,NA ,NA ,NA ,NA,
0.03291028 ,NA ,NA ,NA ,NA,
NA ,NA ,NA ,-0.5462126 ,-0.1997394,
NA ,-0.3419413 ,-0.2675226 ,-1.000133 ,-0.1346234,
NA ,NA ,-0.411743 ,1.301612 ,NA,
0.922197 ,NA ,0.9513522 ,0.2357021 ,NA),
nrow=10, ncol=5)
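# Covariance matrix computed in one call on the whole matrix
c1 <- cov(x, use="pairwise.complete")
# The same covariances computed one pair of columns at a time
c2 <- matrix(nrow=5, ncol=5)
for (i in 1:5)
{
  for (j in 1:5)
  {
    c2[i,j] <- cov(x[,i], x[,j], use="pairwise.complete")
  }
}
# The two results should agree, so the difference should be zero
c2-c1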
Instead, R-1.8.0 produces this result:
            [,1]        [,2]       [,3]          [,4]        [,5]
[1,]  0.00000000 -0.03053828         NA -0.0144996353 -0.03485883
[2,] -0.03053828 -0.01649857         NA  0.0137259383 -0.02960707
[3,]          NA          NA -0.1296134            NA          NA
[4,] -0.01449964  0.01372594         NA -0.0003152629  0.08717648
[5,] -0.03485883 -0.02960707         NA  0.0871764791  0.04961190
This happens under Linux (SuSE 9.1) as well as under Windows NT.
Under 1.9.1 (Linux) and 1.9.0 (Windows) I get the expected matrix of
zeroes and NaNs.
This example is not special. Under R-1.8.0, cov() produced wrong results
for every random matrix I tried.
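
The random-matrix check described above could be sketched roughly as follows. This is only an illustration, not the code used in the original report: the helpers pairwise_cov_ref() and max_cov_discrepancy() are hypothetical names, and the reference value is computed directly from the sample covariance formula on the rows where both columns are observed, so it does not rely on cov() itself.

```r
# Hypothetical helper (not part of R): sample covariance of two vectors,
# using only the positions where both values are observed.
pairwise_cov_ref <- function(a, b) {
  ok <- !is.na(a) & !is.na(b)
  n <- sum(ok)
  if (n < 2) return(NA_real_)   # covariance needs at least two complete pairs
  a <- a[ok]
  b <- b[ok]
  sum((a - mean(a)) * (b - mean(b))) / (n - 1)
}

# Hypothetical helper: largest absolute difference between cov() applied to
# the whole matrix and the column-by-column reference values.
max_cov_discrepancy <- function(x) {
  p <- ncol(x)
  ref <- matrix(NA_real_, p, p)
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      ref[i, j] <- pairwise_cov_ref(x[, i], x[, j])
    }
  }
  got <- cov(x, use = "pairwise.complete.obs")
  both <- !is.na(ref) & !is.na(got)   # compare only where both are defined
  if (!any(both)) return(0)
  max(abs(ref[both] - got[both]))
}

set.seed(1)
x <- matrix(rnorm(50), nrow = 10, ncol = 5)
x[sample(length(x), 15)] <- NA        # knock out 15 entries at random
max_cov_discrepancy(x)
```

On a version of R where cov() works correctly, the discrepancy should be zero up to floating-point rounding. A clearly nonzero value would point to the kind of error reported here.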
Doesn't this mean that *any* result obtained under R 1.8.0 is unreliable?
By the way, I just recompiled R-1.8.0 from source under Linux and ran
'make check'. All tests passed.
Is there a more detailed set of tests that could ensure that
at least the most basic R functions work correctly?
Christian
Covariance bug in R-1.8.0
2 messages · lederer@trium.de, Peter Dalgaard
lederer at trium.de writes:
> R-1.8.0 seems to calculate wrong covariances when the argument of cov() is a matrix or a data frame. The following should produce a matrix of zeroes and NaNs:
> ...
> Under 1.9.1 (Linux) and 1.9.0 (Windows) I get the expected matrix of zeroes and NaNs. This example is not special. Under R-1.8.0, cov() produced wrong results for every random matrix I tried.
Presumably, this is the same as PR#4646.
> Doesn't this mean that *any* result obtained under R 1.8.0 is unreliable?
It means that covariances and correlations are sometimes computed incorrectly.
> By the way, I just recompiled R-1.8.0 from source under Linux and ran 'make check'. All tests passed.
Yes. We don't release versions that don't pass their own tests.
> Is there a more detailed set of tests that could ensure that at least the most basic R functions work correctly?
We add regression tests as we discover and fix bugs. We can't fix old versions retroactively, though; we release patch versions (e.g. 1.8.1) instead.
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
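
A regression test of the kind mentioned in the reply might look roughly like the sketch below. It is an assumption about the general shape of such a check, not the test actually added for PR#4646; the data are made up for illustration. The idea is to turn the visual comparison from the original report into an assertion that stops the script when the matrix form of cov() and the column-by-column form disagree.

```r
# Made-up data with missing values scattered across the columns.
x <- cbind(c(1, 2, NA, 4, 5, 7),
           c(2, NA, 1, 3, 9, 4),
           c(NA, 5, 3, 8, 2, 6))

# Covariances from a single call on the whole matrix.
from_matrix <- cov(x, use = "pairwise.complete.obs")

# The same covariances, one pair of columns at a time.
from_pairs <- matrix(NA_real_, ncol(x), ncol(x))
for (i in seq_len(ncol(x))) {
  for (j in seq_len(ncol(x))) {
    from_pairs[i, j] <- cov(x[, i], x[, j], use = "pairwise.complete.obs")
  }
}

# stopifnot() raises an error if the two results are not numerically equal.
stopifnot(isTRUE(all.equal(from_matrix, from_pairs)))
```

Using all.equal() rather than exact equality allows for small floating-point differences between the two computation paths.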