a question about "by" and "ddply"
On May 29, 2012, at 6:32 PM, jacaranda tree wrote:
Hi all, I have a data set (df, n=10 for the sake of simplicity here) with two continuous variables (age and weight) and a grouping variable (group, with two levels). I want to run correlations for each group separately (similar to "split file" in SPSS). I've been experimenting with different functions, and I was able to do this correctly using the ddply function, but the output is a little difficult to read when I use cor.test to get all the data with p values, df, and Pearson r (see below). I also tried to do it with the by function. Although by shows the results for the two groups separately, it seems to calculate the same r for both groups. Here is my code for both ddply and by, along with the output. I was wondering if there is a way to display the output better with ddply, or to run the correlations correctly for each group using by. Thanks in advance,
I would have imagined something along the lines of lapply( split(df, df$group), function(x) cor.test(x[["age"]], x[["weight"]]) ) ... but without an example it's only a guess.
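A minimal sketch of the split/lapply idea follows. The original df was not posted, so the data here are invented for illustration; only the column names (group, age, weight) come from the question.

```r
# Hypothetical data: 10 rows, two groups of 5 (values invented for illustration)
set.seed(1)
df <- data.frame(
  group = rep(1:2, each = 5),
  age   = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45)
)
df$weight <- 50 + 0.8 * df$age + rnorm(10, sd = 2)

# Split the data frame by group, then run cor.test on each piece.
# Inside the function, x is the subset of rows for one group.
res <- lapply(split(df, df$group),
              function(x) cor.test(x[["age"]], x[["weight"]]))

res[["1"]]$estimate  # Pearson r for group 1
res[["2"]]$p.value   # p-value for group 2
```

The result is a named list with one "htest" object per group, so printing res shows the usual cor.test summary for each level of group.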
David

> 1. with "ddply"
> r <- ddply(df, .(group), summarise, "corr" = cor.test(age, weight,
>   method = "pearson"))
>
> Output:
>    Group corr
> 1  1     Inf
> 2  1     3
> 3  1     0
> 4  1     1
> 5  1     0
> 6  1     two.sided
> 7  1     Pearson's product-moment correlation
> 8  1     age and weight
> 9  1     1, 1
> 10 2     9.722211
> 11 2     3
> 12 2     0.002311412
> 13 2     0.9844986
> 14 2     0
> 15 2     two.sided
> 16 2     Pearson's product-moment correlation
> 17 2     age and weight
> 18 2     0.7779640, 0.9990233
>
> 2. with "by"
> r <- by(df, group, FUN = function(x) cor.test(age, weight, method =
>   "pearson"))
>
> Output:
> Group: 1
>
> Pearson's product-moment correlation
>
> data: age and weight
> t = 6.4475, df = 8, p-value = 0.0001988
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
> 0.6757758 0.9802100
> sample estimates:
> cor
> 0.9157592
>
> ------------------------------------------------------------
> Group: 2
>
> Pearson's product-moment correlation
>
> data: age and weight
> t = 6.4475, df = 8, p-value = 0.0001988
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
> 0.6757758 0.9802100
> sample estimates:
> cor
> 0.9157592
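A likely reason the quoted by() call reports the same r twice: its function never uses x, so age and weight are looked up outside the per-group subset, and every call sees all the rows. That is consistent with the quoted output, where by() reports df = 8 (all 10 rows) while the ddply output shows 3 (5 rows per group). The sketch below, on invented data (the original df was not posted), refers to the columns through x and then collects the pieces of each result into one row per group. The names tidy and groups are made up for this example.

```r
# Hypothetical data: values invented for illustration
df <- data.frame(
  group  = rep(1:2, each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 66, 71, 80, 85, 58, 70, 69, 83, 90)
)

# Use x$age and x$weight: x is the per-group subset that by() passes in
r <- by(df, df$group,
        function(x) cor.test(x$age, x$weight, method = "pearson"))

# Collect the key numbers from each htest object into one row per group
groups <- dimnames(r)[[1]]
tidy <- do.call(rbind, lapply(seq_along(groups), function(i) {
  ct <- r[[i]]
  data.frame(group     = groups[i],
             r         = unname(ct$estimate),
             t         = unname(ct$statistic),
             df        = unname(ct$parameter),
             p.value   = ct$p.value,
             conf.low  = ct$conf.int[1],
             conf.high = ct$conf.int[2])
}))
print(tidy)
```

A function that returns a one-row data frame like this can also be passed to ddply in place of summarise, which should give the same kind of one-row-per-group table.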