Message-ID: <20040409170710.1110910476@slim.kubism.ku.dk>
Date: 2004-04-09T19:07:12Z
From: Uwe Ligges
Subject: Incorrect handling of NA's in cor() (PR#6750)

msa@biostat.mgh.harvard.edu wrote:
> Full_Name: Marek Ancukiewicz
> Version: 1.8.1
> OS: Linux
> Submission from: (NULL) (132.183.12.87)
> 
> 
> Function cor() incorrectly handles missing observations with method="spearman":
> 
> 
>>x <- c(1,2,3,NA,5,6)
>>y <- c(4,NA,2,5,1,3)
>>cor(x,y,use="complete.obs",method="s")
> 
> [1] -0.1428571
> 
>>cor(x[!is.na(x)&!is.na(y)],y[!is.na(x)&!is.na(y)],method="s")
> 
> [1] -0.4
> 
> These two results should be the same.
> 


No! Please at least read the help file, ?cor, before submitting a bug
report:


"If use is "complete.obs" then missing values are handled by casewise 
deletion. Finally, if use has the value "pairwise.complete.obs" then the 
correlation between each pair of variables is computed using all 
complete pairs of observations on those variables."


Hence
   cor(x, y, use="pairwise.complete.obs", method="s")
is what you expect ...

Uwe Ligges