Incorrect handling of NA's in cor() (PR#6750)
msa@biostat.mgh.harvard.edu wrote:
Full_Name: Marek Ancukiewicz
Version: 1.8.1
OS: Linux
Submission from: (NULL) (132.183.12.87)

Function cor() incorrectly handles missing observations with method="spearman":
x <- c(1,2,3,NA,5,6)
y <- c(4,NA,2,5,1,3)
cor(x,y,use="complete.obs",method="s")
[1] -0.1428571
cor(x[!is.na(x)&!is.na(y)],y[!is.na(x)&!is.na(y)],method="s")
[1] -0.4

These two results should be the same.
No! Please read at least the help file, ?cor, before submitting a bug report:

"If use is "complete.obs" then missing values are handled by casewise deletion. Finally, if use has the value "pairwise.complete.obs" then the correlation between each pair of variables is computed using all complete pairs of observations on those variables."

Hence
cor(x, y, use="pairwise.complete.obs", method="s")
is what you expect ...

Uwe Ligges
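
The distinction the quoted help text draws between casewise and pairwise deletion is easiest to see on a matrix with more than two columns. The following is only a minimal sketch: the matrix m and its column names a, b and c are made up for illustration and do not come from the original report, and the default Pearson method is used to keep the example small.

```r
# Hypothetical three-column data (not from the report); NAs sit in different rows.
m <- cbind(
  a = c(1, 2, 3, NA, 5, 6),
  b = c(4, NA, 2, 5, 1, 3),
  c = c(2, 1, 4, 3, 6, 5)
)

# Casewise deletion: every row containing any NA is dropped before correlating.
casewise <- cor(m, use = "complete.obs")

# Pairwise deletion: for each pair of columns, only the rows with an NA
# in one of those two columns are dropped.
pairwise <- cor(m, use = "pairwise.complete.obs")

# Casewise deletion should match correlating the fully observed rows directly.
casewise_check <- all.equal(casewise, cor(m[complete.cases(m), ]))

# Column c has no NAs, so the pairwise a-c entry uses every row where a is observed.
keep <- !is.na(m[, "a"])
pairwise_check <- all.equal(pairwise["a", "c"], cor(m[keep, "a"], m[keep, "c"]))

print(casewise)
print(pairwise)
```

Here the a-c entry differs between the two matrices, because casewise deletion also discards the row where only b is missing, while pairwise deletion keeps it for the a-c pair.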