
Missing Value And cor() function

4 messages · vincent.stoliaroff@socgen.com, Frank E Harrell Jr, Christian Schulz +1 more

#
Hi R lovers!

I'd like to apply the cor() function to a matrix which has some missing values.
As a matter of fact, and quite logically, it doesn't work.
Is there a trick to replace the missing values with the mean of each variable or with some other relevant figure?
Or should I use a special variant of the cor() function (I don't know whether one exists, and have trouble working out how it would behave)
to get around this problem?
Thanks a lot for any suggestions and help.

Vincent





#
On Thu, 24 Apr 2003 12:27:50 +0200
vincent.stoliaroff at socgen.com wrote:

            
Even though pairwise deletion of NAs will sometimes produce a singular correlation matrix, that is better than replacing NAs with constants, which distorts the correlations.
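
To illustrate the distortion, here is a minimal sketch in base R. The toy vectors x and y are made up for the example (y equals x exactly, so the true correlation is 1); they are not from the original question.

```r
# Assumed toy data: y is identical to x, so the true correlation is 1.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- x
y[c(1, 8)] <- NA   # make the first and last values of y missing

# Pairwise deletion: only the rows where both x and y are observed are used.
r_pairwise <- cor(x, y, use = "pairwise.complete.obs")

# Mean imputation: fill each NA with the mean of the observed y values.
y_imputed <- ifelse(is.na(y), mean(y, na.rm = TRUE), y)
r_imputed <- cor(x, y_imputed)

r_pairwise   # 1: the remaining pairs still lie exactly on a line
r_imputed    # about 0.65: the imputed constants pull the correlation down
```

How large the distortion is depends on how many values are missing and where they fall; this example only shows the direction of the effect.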

You may want to look at the rcorr function in the Hmisc package, which does pairwise deletion of NAs for Pearson and Spearman correlations.  See http://hesweb1.med.virginia.edu/biostat/s/Hmisc.html
Please do not include such long disclaimers in your messages.
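
A rough sketch of what the rcorr suggestion could look like, assuming the Hmisc package is installed. The matrix m and its column names are invented for illustration, and the call is skipped if Hmisc is not available.

```r
# Run only if Hmisc is installed; it is not part of base R.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  set.seed(1)
  # Assumed example data: 10 observations of 4 variables.
  m <- matrix(rnorm(40), ncol = 4,
              dimnames = list(NULL, c("a", "b", "c", "d")))
  m[c(2, 7), "a"] <- NA   # two missing values in column a
  m[5, "c"] <- NA         # one missing value in column c

  # rcorr() deletes NAs pairwise; type can be "pearson" or "spearman".
  res <- Hmisc::rcorr(m, type = "pearson")

  res$r   # correlation matrix
  res$n   # number of observations used for each pair of columns
  res$P   # p-values for each pair
}
```

The n component is worth checking, since with pairwise deletion each correlation can rest on a different number of observations.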

---
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
#
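# Replace each NA in a numeric vector with the mean of its non-missing values.
replace.na.m <- function(x) {
    m <- mean(x, na.rm = TRUE)
    ifelse(is.na(x), m, x)
}

# Apply column by column; 'mydata' stands for your numeric matrix or data frame.
apply(mydata, 2, replace.na.m)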

Another way is to call cor() with its 'use' argument, which the help page describes as:
 use: an optional character string giving a method for computing
          covariances in the presence of missing values.  This must be
          (an abbreviation of) one of the strings `"all.obs"',
          `"complete.obs"' or `"pairwise.complete.obs"'.
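
For example, a small sketch of the two NA-tolerant settings in base R. The matrix m is made up for illustration.

```r
# Assumed example matrix with a missing value in columns a and c.
m <- cbind(a = c(1, 2, 3, 4, NA),
           b = c(2, 1, 4, 3, 5),
           c = c(NA, 1, 2, 4, 3))

# "complete.obs": drop every row that contains any NA (rows 2 to 4 remain).
r_complete <- cor(m, use = "complete.obs")

# "pairwise.complete.obs": for each pair of columns, use all rows
# where both of those two columns are observed.
r_pairwise <- cor(m, use = "pairwise.complete.obs")

r_complete
r_pairwise
```

With "pairwise.complete.obs" each entry may be based on a different subset of rows, so the result is not guaranteed to be a valid (positive semi-definite) correlation matrix.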

Perhaps this helps you.
Regards, Christian


#
On Thu, 24 Apr 2003 vincent.stoliaroff at socgen.com wrote:

            
The cor() function has two options for handling missing values, which are
described on its help page.

	-thomas