aggregate
You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’. The best way to send data to the list is to use dput:
dput(myData)
structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("X",
"Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")
Combining two labels just requires the paste0() function:
sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))
S1A S1B S2A S2B
 22  38  38  22

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 11:56 AM
To: David L Carlson
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] aggregate

Thanks a lot, David! I want to further expand the operation a little bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4, 3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’), Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))
myData
  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y, but at each combination level of factors S and Z. In other words, the cross product would still be performed on each pair of rows in the new dataframe myData. How can I achieve that?
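For grouping on every combination of S and Z, one possible sketch (an alternative, not something proposed in this thread) passes both factors to split() as a list, so the combined grouping is built without pasting labels by hand. The data frame is retyped here with plain ASCII quotes, and drop = TRUE is an assumption that empty combinations should be skipped. Group names come out joined with a dot, e.g. "S1.A".

```r
# Example data retyped with straight quotes so that R can parse it
myData <- data.frame(X = c(1, 2, 3, 4, 5, 6, 7, 8),
                     Y = c(8, 7, 6, 5, 4, 3, 2, 1),
                     S = c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2"),
                     Z = c("A", "A", "B", "B", "A", "A", "B", "B"))

# A list of grouping vectors makes split() group on their interaction;
# drop = TRUE omits S/Z combinations that have no rows
groups <- split(myData, list(myData$S, myData$Z), drop = TRUE)

# crossprod() of two vectors returns a 1 x 1 matrix; as.numeric() unwraps it
cp <- sapply(groups, function(d) as.numeric(crossprod(d$X, d$Y)))
cp
```

Unlike paste0(), the dot separator keeps labels such as "S1" + "1A" and "S11" + "A" from collapsing into the same group name.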
On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarlson at tamu.edu> wrote:
Your approach is fine, but it will be a little simpler if you use sapply() instead:
data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
    function(x) crossprod(x[, 1], x[, 2])))
  Z CP
A A 10
B B 10

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 10:17 AM
To: David L Carlson
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] aggregate

Thank you all for the suggestions! Yes, I'm looking for the cross product between the two columns of X and Y.

A follow-up question: what is a nice way to merge the output of

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

with the column Z in myData so that I would get a new dataframe as the following (the 2nd column is the cross product between X and Y)?

Z CP
A 10
B 10

Is the following legitimate?

data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))))

On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarlson at tamu.edu> wrote:
Thank you for the reproducible example, but it is not clear what cross product you want. Jim's solution gives you the cross product of the 2-column matrix with itself. If you want the cross product between the columns you need something else. The aggregate function will not work since it will treat the columns separately:
A <- as.matrix(myData[myData$Z=="A", 1:2])
A
  X Y
1 1 4
2 2 3
crossprod(A) # Same as t(A) %*% A
   X  Y
X  5 10
Y 10 25
crossprod(A[, 1], A[, 2]) # Same as t(A[, 1]) %*% A[, 2]
     [,1]
[1,]   10
# For all the groups
lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))
$A
   X  Y
X  5 10
Y 10 25

$B
   X  Y
X 25 10
Y 10  5
lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
$A
[,1]
[1,] 10
$B
[,1]
[1,] 10
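Since crossprod() of two vectors is just the sum of their elementwise products, the per-group result can also be collected as a plain named numeric vector. The sketch below assumes the four-row myData from the original question, rebuilt with stringsAsFactors = TRUE (an addition, so that Z is a factor on R 4.0 and later); vapply() is used only to get a vector back instead of a list.

```r
# Four-row example from the original question; stringsAsFactors = TRUE is an
# assumption so that Z is a factor regardless of the R version
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"), stringsAsFactors = TRUE)

# Sum of elementwise products per level of Z, returned as a named vector
byGroup <- vapply(split(myData, myData$Z),
                  function(d) sum(d$X * d$Y),
                  numeric(1))
byGroup

# The crossprod() version gives the same numbers once sapply() simplifies
# the 1 x 1 matrices
viaCrossprod <- sapply(split(myData, myData$Z),
                       function(d) crossprod(d$X, d$Y))
all.equal(byGroup, viaCrossprod)
```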
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
Sent: Tuesday, August 23, 2016 6:02 PM
To: Gang Chen; r-help mailing list
Subject: Re: [R] aggregate
Hi Gang Chen,
If I have the right idea:
for(zval in levels(myData$Z))
  # print() is needed because results are not auto-printed inside a for loop
  print(crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")])))
Jim
On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangchen6 at gmail.com> wrote:
This is a simple question: With a dataframe like the following
myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))
how can I get the cross product between X and Y for each level of
factor Z? My difficulty is that I don't know how to deal with the fact
that crossprod() acts on two variables in this case.
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.