X-Original-To: msa@biostat.mgh.harvard.edu
Date: Fri, 9 Apr 2004 10:42:59 -0700 (PDT)
From: Thomas Lumley <tlumley@u.washington.edu>
Cc: r-devel@stat.math.ethz.ch, R-bugs@biostat.ku.dk
On Fri, 9 Apr 2004 msa@biostat.mgh.harvard.edu wrote:
> Dear Uwe,
> You are wrong. First, I read the help file before
> submitting the report. For two variables,
> use="pairwise.complete.obs" and use="complete.obs" should be
> equivalent, shouldn't they? Of course, the results will be
> different when we have more than 2 variables. Second, with the
> call you proposed I am also getting an incorrect result:
I think it's more complicated than either of you is considering.
For the Pearson correlation everything is straightforward, and
pairwise.complete is the same as complete, which is the same as dropping
the NAs manually.
For the rank correlations the question is when the ranking should be done.
The cor() function ranks the observations and then drops missing values,
whereas the manual approach drops missing values and then ranks.
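To make the distinction concrete, here is a rough sketch in Python (not R, and not the actual cor() code) that computes a Spearman-style correlation both ways. The helper names, the toy data, and the handling of missing values during ranking (each variable is ranked over its own non-missing values, and missing entries stay missing) are assumptions made purely for illustration.

```python
from math import sqrt

def ranks(values):
    """1-based average ranks of the non-missing entries; None stays None."""
    present = sorted((v, i) for i, v in enumerate(values) if v is not None)
    out = [None] * len(values)
    pos = 0
    while pos < len(present):
        end = pos
        # Extend the run over tied values so they share one average rank.
        while end + 1 < len(present) and present[end + 1][0] == present[pos][0]:
            end += 1
        avg = (pos + end) / 2 + 1
        for _, i in present[pos:end + 1]:
            out[i] = avg
        pos = end + 1
    return out

def pearson(a, b):
    """Plain Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def complete_pairs(a, b):
    """Keep only the positions where both entries are present."""
    pairs = [(p, q) for p, q in zip(a, b) if p is not None and q is not None]
    return [p for p, _ in pairs], [q for _, q in pairs]

def rank_then_drop(x, y):
    # Rank each variable first, then discard incomplete pairs.
    return pearson(*complete_pairs(ranks(x), ranks(y)))

def drop_then_rank(x, y):
    # Discard incomplete pairs first, then rank what is left.
    cx, cy = complete_pairs(x, y)
    return pearson(ranks(cx), ranks(cy))

# Hypothetical toy data: one missing value in y only.
x = [1, 2, 3, 4, 5]
y = [2, None, 1, 5, 4]

print("rank then drop:", rank_then_drop(x, y))
print("drop then rank:", drop_then_rank(x, y))
```

With these data the two orders disagree, because ranking x before dropping leaves a gap in its ranks (1, 3, 4, 5) where the discarded observation used to be, while ranking afterwards gives 1, 2, 3, 4. On the raw values the order cannot matter, which is why the Pearson case is unproblematic.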
I'm not convinced that it is obvious which of these is right, though
certainly the help page should document whichever is being done.
-thomas