I am trying to calculate a partial correlation and p-values. Unfortunately, the results in R are different than what SPSS gives. Here is an example in R (calculating the partial correlation of x and y, controlling for z1 and z2): x <- c(1,20,14,30,9,4,8) y <- c(5,6,7,9,NA,10,6) z1 <- c(13,8,16,14,26,13,20) z2 <- c(12,NA,2,5,8,16,13) fmx <- lm(x ~ z1 + z2, na.action = na.exclude) fmy <- lm(y ~ z1 + z2, na.action = na.exclude) yres <- resid(fmy) xres <- resid(fmx) cor(xres, yres, use = "p") ct <- cor.test(xres, yres) ct$estimate ct$p.value R give me: r = .65, p = .23 However, SPSS calculates: r = .46, p = .70 I think something may be different with R's handling of missing data, as when I replace the NA's with values, R and SPSS give the same r-values, albeit different p-values still. I am doing pairwise case exclusion in both R and SPSS. Any ideas why I'm getting different values? Is something wrong with my formula in R? Any help would be greatly appreciated. Thanks!
Peter Ehlers wrote:
dadrivr wrote:
The variables have the same length, but with different numbers of missing values (NA). As a result, the residuals calculations (xres & yres) have different lengths, and I cannot compute the correlation between the two (error of incompatible dimensions - see example below). Is there a way, when calculating residuals, to leave the NAs in the residual calculation and output? Thanks! x <- c(1,20,14,NA,9) y <- c(5,6,7,9,10) z <- c(13,NA,16,14,NA) xres <- residuals(lm(x ~ z)) yres <- residuals(lm(y ~ z)) cor(xres, yres) ct <- cor.test(xres, yres) ct$estimate ct$p.value
Well, your example above just uses two points for the x on z regression and that gives zero-residuals. Let's use something a bit more realistic: set.seed(123) x <- rnorm(5) y <- rnorm(5) z <- rnorm(5) x[sample(5,1)] <- NA z[sample(5,1)] <- NA fmx <- lm(x ~ z, na.action = na.exclude) fmy <- lm(y ~ z, na.action = na.exclude) yres <- resid(fmy) xres <- resid(fmx) cor(xres, yres, use = "p") # -0.95978 ct <- cor.test(xres, yres) ct$estimate # -0.95978 ct$p.value # 0.04021994 Check out the 'na.exclude' action in lm(): it preserves the length of the residual vector. Then the 'use=' argument to cor() uses pairwise complete observations. cor.test() will do that automatically. -Peter Ehlers
Ista Zahn wrote:
1) Think about what you did wrong. It doesn't make sense to do correlation/regression with variables of different lengths. You can have missing values in one or more variables, if that's what you mean. Just code them NA. 2) Just add in the predictors, e.g. residuals(lm(y ~ z1 + z2)) -Ista On Wed, Nov 11, 2009 at 10:34 PM, dadrivr <dadrivr at gmail.com> wrote:
Awesome, that's what I was looking for. ? I have two additional questions: (1) What can I do if the variables are of different lengths? (2) How do I update the formula if I want to control for more than one variable. Let's take the following example: x <- c(1,20,14,7,9) y <- c(5,6,7,9,10,11) z <- c(13,27,16,5,4,17,20) a <- c(4,6,7,1) xres <- residuals(lm(x ~ z)) yres <- residuals(lm(y ~ z)) cor(xres, yres) ct <- cor.test(xres, yres) ct$estimate ct$p.value How do I update the above formula to: (1) take into account that the variables are of different lengths? ? I get an error when calculating the residuals. (2) control for z and a (i.e., more than one variable)? Thanks so much for your help. Peter Ehlers wrote:
dadrivr wrote:
I'm trying to write code to calculate partial correlations (along with p-values). ? I'm new to R, and I don't know how to do this. ? I have searched and come across different functions, but I haven't been able to get any of them to work (for example, pcor and pcor.test from the ggm package). In the following example, I am trying to compute the correlation between x and y, while controlling for z (partial correlation): x <- c(1,20,14,7,9) y <- c(5,6,7,9,10) z <- c(13,27,16,5,4) What function can I append to this to find this partial correlation? Many thanks!
I'm not sure what you need, but does this give you what you want: xres <- residuals(lm(x ~ z)) yres <- residuals(lm(y ~ z)) cor(xres, yres) # [1] 0.9778857 or ct <- cor.test(xres, yres) ct$estimate ? # 0.9978857 ct$p.value ? # 0.003934582 ? -Peter Ehlers
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- View this message in context: http://old.nabble.com/Partial-correlations-and-p-values-tp26308463p26312873.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
View this message in context: http://n4.nabble.com/Partial-correlations-and-p-values-tp908641p932659.html Sent from the R help mailing list archive at Nabble.com.