Date: Mon, 08 Feb 2010 14:23:08 +0100
From: Peter Dalgaard <P.Dalgaard at biostat.ku.dk>
Cc: r-devel at stat.math.ethz.ch, R-bugs at r-project.org
msa at biostat.mgh.harvard.edu wrote:
Full_Name: Marek Ancukiewicz
Version: 2.10.1
OS: Linux
Submission from: (NULL) (74.0.49.2)
Both cor() and cor.test() incorrectly handle ordered variables with
method="kendall", cor() incorrectly handles ordered variables for
method="spearman" (method="pearson" always works correctly, while
method="spearman" works for cor.test, but not for cor()).
In erroneous calculations these functions ignore the inherent ordering
of the ordered variable (e.g., '9'<'10'<'11') and instead seem to assume
an alphabetic ordering ('10'<'11'<'9').
Strictly speaking, not a bug, since the documentation has
x: a numeric vector, matrix or data frame.
respectively
x, y: numeric vectors of data values. 'x' and 'y' must have the
same length.
so no one ever claimed that class "ordered" variables should work.
However, the root cause is that as.vector on a factor variable (ordered
or not) converts it to a character vector, hence
rank(as.vector(as.ordered(9:11)))
[1] 3 1 2
Looks like a simple fix would be to use as.vector(x, "numeric") inside
the definition of cor().
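As a rough illustration of the point above (a caller-side sketch, not the actual change to cor(); the names f and codes are made up for this example), converting the ordered factor to its numeric level codes before ranking restores the intended order:

```r
# Illustrative sketch only; `f` and `codes` are hypothetical names.
f <- as.ordered(9:11)              # levels are ordered 9 < 10 < 11

as.vector(f)                       # character vector: "9" "10" "11"
rank(as.vector(f))                 # 3 1 2, strings are ranked, not levels

codes <- as.vector(f, "numeric")   # level codes: 1 2 3
rank(codes)                        # 1 2 3, follows the level order

# Workaround on the caller's side: pass the level codes explicitly.
cor(codes, 1:3, method = "kendall")    # 1
cor(codes, 1:3, method = "spearman")   # 1
```

Note that the codes are positions within the levels, not the original values 9, 10, 11; that is sufficient for rank-based methods but would not be appropriate for method="pearson".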
cor(as.ordered(9:11),1:3,method="k")
cor.test(as.ordered(9:11),1:3,method="k")
Kendall's rank correlation tau
data: as.ordered(9:11) and 1:3
T = 1, p-value = 1
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
-0.3333333
cor(as.ordered(9:11),1:3,method="s")
cor.test(as.ordered(9:11),1:3,method="s")
Spearman's rank correlation rho
data: as.ordered(9:11) and 1:3
S = 0, p-value = 0.3333
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
1