
multivariate version of aggregate

9 messages · Jannis, David L Carlson, Greg Snow +2 more

#
Dear List members,


I am seeking a multivariate version of aggregate. I want to compute, for 
example, the correlation between subsets of two vectors. In aggregate, I 
can only supply one vector with indices for subsets. Is there a ready-made 
function for this, or do I need to program my own?


Cheers
Jannis
#
You can pass a matrix to by()
g=rep(LETTERS[1:2], each=25))
[1] -0.05643063  0.16465040
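
The call above appears truncated in the archive. A minimal sketch of the idea, assuming a data frame `dat` with numeric columns `x` and `y` and a grouping column `g` (the seed and the way `dat` is built are made up here, so the numbers will not match the output shown):

```r
# Hypothetical data: the original construction of `dat` is not preserved.
set.seed(1)
dat <- data.frame(x = rnorm(50), y = rnorm(50),
                  g = rep(LETTERS[1:2], each = 25))

# by() passes all columns of each group's rows to the function,
# so the function can use both x and y at once.
res <- by(dat[, c("x", "y")], dat$g, function(d) cor(d$x, d$y))

# Drop the "by" class and dimnames to get a plain numeric vector (A, then B).
group_cor <- as.vector(res)
```

Unlike aggregate(), which applies the function to one column at a time, by() gives the function the whole block of rows for each group.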

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Jannis
Sent: Thursday, June 27, 2013 12:27 PM
To: r-help
Subject: [R] multivariate version of aggregate


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hello,

Or use ?sapply.

sapply(split(dat[1:2], dat[3]), function(x) cor(x[1], x[2]))


Hope this helps,

Rui Barradas

On 27-06-2013 20:22, David Carlson wrote:
#
Hi,
Maybe this also helps:
library(data.table)
dt1 <- data.table(dat)
dt1[, cor(x, y), by = g]
#   g          V1
#1: A -0.05643063
#2: B  0.16465040
dt1[, cor(x, y), by = g]$V1
#[1] -0.05643063  0.16465040

A.K.

----- Original Message -----
From: David Carlson <dcarlson at tamu.edu>
To: 'Jannis' <bt_jannis at yahoo.de>; 'r-help' <r-help at r-project.org>
Cc: 
Sent: Thursday, June 27, 2013 3:22 PM
Subject: Re: [R] multivariate version of aggregate

#
Yes, I had a look at that function. From the documentation, however, it 
was not clear to me how to split the data frame into subsets of rows 
based on an index argument. Something like:


testframe <- data.frame(a=rnorm(100), b = rnorm(100))
indices      <- rep(c(1,2), each = 50)


results <- ddply(.data = testframe, INDICES= indices, .fun = function(x) 
corr(x[,1], x[,2]))

Where the last command would yield the correlations between column 1 and 
2 of the first 50 and of the last 50 values.

Any ideas?

Jannis
On 27.06.2013 21:43, Greg Snow wrote:
#
Hello,

You can solve your problem using only base R, with no need for an 
external package. The two instructions below are two ways of doing the same thing.



sapply(split(testframe, indices), function(x) cor(x[, 1], x[, 2]))

as.vector(by(testframe, indices, function(x) cor(x[, 1], x[, 2])))
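
As a rough check, the two calls above can be compared on reproducible data. The sketch below also adds a third variant, which is an assumption of mine rather than something proposed in the thread: mapply() over two separately split vectors, which is close to the "two vectors" wording of the original question. The seed and the names `via_sapply`, `via_by` and `via_mapply` are illustrative only.

```r
set.seed(45)
testframe <- data.frame(a = rnorm(100), b = rnorm(100))
indices <- rep(c(1, 2), each = 50)

# Split the data frame by group, then correlate the two columns per piece.
via_sapply <- sapply(split(testframe, indices), function(x) cor(x[, 1], x[, 2]))

# Same computation with by(); as.vector() drops the "by" class and names.
via_by <- as.vector(by(testframe, indices, function(x) cor(x[, 1], x[, 2])))

# Split each vector on its own and walk over the pieces in parallel.
via_mapply <- mapply(cor, split(testframe$a, indices), split(testframe$b, indices))

# All three should agree up to names.
all.equal(unname(via_sapply), via_by)
all.equal(via_sapply, via_mapply)
```

The mapply() form avoids building a data frame first, but it relies on both vectors being split by the same index so that the pieces line up.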



Hope this helps,

Rui Barradas

On 28-06-2013 09:31, Jannis wrote:
#
Thanks a lot to everybody who responded! My solution now looks similar 
to Rui's and David's suggestions.


Jannis
On 28.06.2013 11:00, Rui Barradas wrote:
#
Hi,
set.seed(45)
testframe <- data.frame(a = rnorm(100), b = rnorm(100))
indices <- rep(c(1, 2), each = 50)
library(plyr)

ddply(testframe, .(indices), summarize, Cor1 = cor(a, b))

#  indices         Cor1
#1       1  0.002770524
#2       2 -0.101738888
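
If a table-shaped result like the one above is wanted without plyr, something along these lines might do. This is only a sketch in base R; the names `pieces` and `result` are made up for illustration, and the `indices` column comes back as character because it is taken from the names of the split list.

```r
set.seed(45)
testframe <- data.frame(a = rnorm(100), b = rnorm(100))
indices <- rep(c(1, 2), each = 50)

# One data frame per group, named by the group label.
pieces <- split(testframe, indices)

# One row per group: the label and the within-group correlation.
result <- data.frame(
  indices = names(pieces),
  Cor1 = vapply(pieces, function(d) cor(d$a, d$b), numeric(1)),
  row.names = NULL
)
result
```

vapply() is used instead of sapply() so that a function returning anything other than a single number fails loudly.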


A.K.


----- Original Message -----
From: Jannis <bt_jannis at yahoo.de>
To: Greg Snow <538280 at gmail.com>
Cc: r-help <r-help at r-project.org>
Sent: Friday, June 28, 2013 4:31 AM
Subject: Re: [R] multivariate version of aggregate
