
correlations between columns for each row

6 messages · Rob Griffin, robgriffin247, Joshua Wiley +1 more

1 day later
#
Just as an update on this problem:
I have managed to get the variance for the selected columns.

Now all I need is the covariance between these two selections. The aim is
for a new column to contain, on each row, the covariance between the two
target sets of columns:

maindata[,c(174:213)] and maindata[,c(214:253)]

I've played around with all sorts of apply calls (and derivatives of apply)
in various setups, so I think I'm close, but I feel like I'm chasing my
tail here!

--
View this message in context: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-tp4039193p4073208.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi Rob,

Here is one approach:


## define a function that does the calculations
## (the covariance of two vectors divided by the square root of
## the product of their variances is just a correlation)
rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs")

set.seed(1)
bigdata <- matrix(rnorm(271 * 13890), ncol = 271)

results <- apply(bigdata, 1, FUN = rF, a = 174:213, b = 214:253)

## combine
bigdata <- cbind(bigdata, iecorr = results)
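As a quick check of the comment above, here is a minimal sketch on a small made-up matrix (m, a and b are illustrative names; columns 1:4 and 5:8 stand in for 174:213 and 214:253). It computes the row-wise covariance that was originally asked for, and shows that dividing it by the square root of the product of the two row-wise variances gives the same numbers as cor(). The identity is only checked here for data without missing values.

```r
## Made-up data: 5 rows, 8 columns, no missing values
set.seed(1)
m <- matrix(rnorm(5 * 8), ncol = 8)
a <- 1:4   # stands in for 174:213
b <- 5:8   # stands in for 214:253

## Row-wise covariance between the two blocks of columns
rowCov <- apply(m, 1, function(x) cov(x[a], x[b]))

## Row-wise variance of each block
varA <- apply(m, 1, function(x) var(x[a]))
varB <- apply(m, 1, function(x) var(x[b]))

## Row-wise correlation, computed directly
rowCor <- apply(m, 1, function(x) cor(x[a], x[b]))

## cov / sqrt(var * var) should match cor() up to rounding error
all.equal(rowCov / sqrt(varA * varB), rowCor)
```

If the covariance itself is what is wanted, rowCov can be bound on as a new column in the same way as iecorr.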

Hope this helps,

Josh

On Tue, Nov 15, 2011 at 8:42 AM, robgriffin247
<robgriffin247 at hotmail.com> wrote:

  
    
#
Error in cor(x[a], x[b], use = "complete.obs") : 'x' must be numeric

This is strange: it works on your example (and you've understood perfectly
what I'm trying to do), but when I use it on the original data it comes up
with the error above.
I've checked str() and the columns are all numeric.

???

-----Original Message----- 
From: Joshua Wiley
Sent: Tuesday, November 15, 2011 7:14 PM
To: robgriffin247
Cc: r-help at r-project.org
Subject: Re: [R] correlations between columns for each row

#
Is the whole thing a data frame? Then any multi-column subset is also a data frame. Try adding an as.matrix() wrapper in the definition of rF.
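
One way this error can arise even when the selected columns are numeric: apply() converts a data frame with as.matrix() before looping over rows, and a single character or factor column anywhere in the data frame turns the whole matrix into character. The sketch below reproduces that on a small made-up data frame (df and its column names are hypothetical, and whether this is what happened with the original data is an assumption).

```r
## Hypothetical data frame: one character column plus six numeric columns
df <- data.frame(id = c("r1", "r2", "r3"),
                 x1 = c(1, 2, 3), x2 = c(2, 4, 7), x3 = c(3, 5, 9),
                 y1 = c(2, 1, 5), y2 = c(4, 3, 6), y3 = c(7, 2, 8),
                 stringsAsFactors = FALSE)

## apply() works on as.matrix(df); the character column makes it all character
mat <- as.matrix(df)
is.character(mat)

rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs")

## Each row reaches rF as a character vector, so cor() stops with an error;
## tryCatch() captures the message instead of halting the script
res <- tryCatch(apply(df, 1, FUN = rF, a = 2:4, b = 5:7),
                error = function(e) conditionMessage(e))
res

## Dropping the character column first keeps the rows numeric
ok <- apply(df[, -1], 1, FUN = rF, a = 1:3, b = 4:6)
ok
```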

Michael
On Nov 15, 2011, at 3:14 PM, "Rob Griffin" <robgriffin247 at hotmail.com> wrote:

            
#
Excellent. as.matrix() didn't work, but I switched to as.numeric() around
both variables in the function definition and it did work:

rF <- function(x, a, b) cor(as.numeric(x[a]), as.numeric(x[b]),
                            use = "complete.obs")
maindata$rFcor <- apply(maindata, 1, FUN = rF, a = 174:213, b = 214:253)
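
An alternative sketch, assuming the data frame contains at least one non-numeric column: convert only the columns of interest to a matrix before calling apply(), so each row arrives as a numeric vector and no as.numeric() round trip is needed. When as.matrix() has to build a character matrix from a data frame, numeric columns are passed through format(), so converting the strings back with as.numeric() may return values rounded to the printed precision. The data frame below is made up, and rF2, num and nA are illustrative names.

```r
## Hypothetical stand-in for maindata: an ID column plus eight numeric columns
set.seed(1)
maindata <- data.frame(id = paste0("row", 1:6),
                       matrix(rnorm(6 * 8), ncol = 8),
                       stringsAsFactors = FALSE)
a <- 2:5   # stands in for 174:213
b <- 6:9   # stands in for 214:253

## Keep only the needed columns; all are numeric, so the matrix is numeric
num <- as.matrix(maindata[, c(a, b)])
nA <- length(a)

## Within each row, the first nA values belong to block a, the rest to block b
rF2 <- function(x) cor(x[seq_len(nA)], x[nA + seq_along(b)],
                       use = "complete.obs")

maindata$rFcor <- apply(num, 1, rF2)
head(maindata$rFcor)
```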

Thanks very much both of you!
Rob

-----Original Message----- 
From: R. Michael Weylandt <michael.weylandt at gmail.com>
Sent: Tuesday, November 15, 2011 9:28 PM
To: Rob Griffin
Cc: Joshua Wiley ; r-help at r-project.org
Subject: Re: [R] correlations between columns for each row
