aggregate

11 messages · Gang Chen, David Winsemius, Jim Lemon +1 more

Gang Chen

Tue, Aug 23, 2016 3:03 PM #

This is a simple question: With a dataframe like the following

myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))

how can I get the cross product between X and Y for each level of
factor Z? My difficulty is that I don't know how to deal with the fact
that crossprod() acts on two variables in this case.

David Winsemius

Tue, Aug 23, 2016 3:55 PM #

Just make a function that takes a dataframe and does a crossprod on two of its columns.

David Winsemius
Alameda, CA, USA
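
A minimal sketch of that suggestion, assuming the four-row myData from the original post. The helper name cp_xy and its arguments are made up for illustration and are not from the thread:

```r
# Example data from the original post.
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

# Hypothetical helper: cross product of two named columns of a data frame.
cp_xy <- function(df, a = "X", b = "Y") {
  drop(crossprod(df[[a]], df[[b]]))  # drop() turns the 1x1 matrix into a plain number
}

# Apply the helper to each subset of rows defined by Z.
cp_by_z <- sapply(split(myData, myData$Z), cp_xy)
print(cp_by_z)
```

Because the helper receives the whole sub-data-frame, it can use both columns at once, which is the point of the suggestion.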

Jim Lemon

Tue, Aug 23, 2016 4:01 PM #

Hi Gang Chen,
If I have the right idea:

for(zval in levels(myData$Z))
  print(crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")])))

Jim

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David L Carlson

Wed, Aug 24, 2016 7:37 AM #

Thank you for the reproducible example, but it is not clear what cross product you want. Jim's solution gives you the cross product of the 2-column matrix with itself. If you want the cross product between the columns you need something else. The aggregate function will not work since it will treat the columns separately:

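A <- as.matrix(myData[myData$Z=="A", 1:2])
A

  X Y
1 1 4
2 2 3

crossprod(A) # Same as t(A) %*% A

   X  Y
X  5 10
Y 10 25

crossprod(A[, 1], A[, 2]) # Same as t(A[, 1]) %*% A[, 2]

     [,1]
[1,]   10

# For all the groups
lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))

$A
   X  Y
X  5 10
Y 10 25

$B
   X  Y
X 25 10
Y 10  5

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

$A
     [,1]
[1,]   10

$B
     [,1]
[1,]   10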

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
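
To make the remark about aggregate() concrete, here is a small sketch using the four-row myData from the original post. The use of by() is an assumption added for contrast; it is one base-R alternative and was not part of the reply above:

```r
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

# aggregate() calls FUN once per column within each group, so FUN never
# sees X and Y together; each column can only be crossed with itself.
agg <- aggregate(cbind(X, Y) ~ Z, data = myData,
                 FUN = function(v) drop(crossprod(v)))
print(agg)

# by() passes the whole sub-data-frame to FUN, so both columns are available.
cp <- by(myData, myData$Z, function(d) drop(crossprod(d$X, d$Y)))
print(cp)
```

The aggregate() call returns the sums of squares of X and of Y per group (the diagonal of the 2 x 2 matrices shown above), not the X-by-Y cross product.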


Gang Chen

Wed, Aug 24, 2016 8:16 AM #

Thank you all for the suggestions! Yes, I'm looking for the cross
product between the two columns of X and Y.

A follow-up question: what is a nice way to merge the output of

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

with the column Z in myData so that I would get a new dataframe as the
following (the 2nd column is the cross product between X and Y)?

Z   CP
A   10
B   10

Is the following legitimate?

data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
myData$Z), function(x) crossprod(x[, 1], x[, 2]))))

David L Carlson

Wed, Aug 24, 2016 8:54 AM #

Yours is fine, but it will be a little simpler if you use sapply() instead:

data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
    function(x) crossprod(x[, 1], x[, 2])))

  Z CP
A A 10
B B 10

David C
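
One caveat for anyone running this on a current installation: since R 4.0.0, data.frame() no longer converts strings to factors by default, so levels(myData$Z) returns NULL for the data as created in the original post. The sketch below is a variation, not the code from the thread; it takes the group labels from the names of the result instead, which works whether or not Z is a factor:

```r
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

# One cross product per group; vapply() checks that each result is a single number.
cp <- vapply(split(myData, myData$Z),
             function(x) drop(crossprod(x$X, x$Y)),
             numeric(1))

# Take the labels from names(cp) so Z and CP are aligned by construction.
result <- data.frame(Z = names(cp), CP = unname(cp))
print(result)
```

Using names(cp) also avoids relying on levels() and split() returning the groups in the same order.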


Gang Chen

Wed, Aug 24, 2016 9:55 AM #

Thanks a lot, David! I want to further expand the operation a little
bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))

myData

  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y,
but for each combination of the levels of factors S and Z. In other
words, the cross product would still be computed over each pair of rows
in the new dataframe myData. How can I achieve that?

David L Carlson

Wed, Aug 24, 2016 10:07 AM #

You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’. The best way to send data to the list is to use dput:

dput(myData)

structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6, 
5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("X", 
"Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

Combining two labels just requires the paste0() function:

sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))

S1A S1B S2A S2B 
 22  38  38  22

David C
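
As a possible alternative to pasting the labels together, split() also accepts a list of grouping variables. The sketch below assumes the eight-row data from the question, rebuilt here with plain quotes; it is a variation and not the code from the reply above:

```r
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  S = rep(c("S1", "S2"), each = 4),
  Z = rep(c("A", "B"), each = 2, times = 2)
)

# A list of grouping variables makes split() form every S-by-Z combination
# itself; the group names are joined with a dot, e.g. "S1.A".
groups <- split(myData, list(myData$S, myData$Z))
cp <- sapply(groups, function(x) drop(crossprod(x$X, x$Y)))
print(cp)
```

The explicit separator matters when labels can run into each other: paste0("S1", "1A") and paste0("S11", "A") both give "S11A". Note that the group order differs from the paste0() version, because here the first factor varies fastest.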

Gang Chen

Wed, Aug 24, 2016 12:50 PM #

Thanks again for patiently offering great help, David! I just learned
dput() and paste0() now. Hopefully this is my last question.
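
For readers who have not met dput() yet, a minimal sketch of why it is the recommended way to post data: its output is plain R code that rebuilds the object exactly, whereas typographic quotes picked up from a word processor do not parse. The tiny data frame df is made up for illustration:

```r
df <- data.frame(S = c("S1", "S2"), stringsAsFactors = TRUE)

# dput() writes plain R code that recreates the object.
txt <- paste(capture.output(dput(df)), collapse = "\n")
copy <- eval(parse(text = txt))
print(identical(df, copy))

# Typographic quotes are not string delimiters in R, so this fails to parse.
bad <- try(parse(text = "c(\u2018S1\u2019)"), silent = TRUE)
print(inherits(bad, "try-error"))
```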

Suppose a new dataframe is as below (one more numeric column):

myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")),
.Names = c("X",
"Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

myData

  X Y   N  S Z
1 1 8 2.1 S1 A
2 2 7 2.1 S1 A
3 3 6 2.1 S1 B
4 4 5 2.1 S1 B
5 5 4 3.2 S2 A
6 6 3 3.2 S2 A
7 7 2 3.2 S2 B
8 8 1 3.2 S2 B

Once I obtain the cross product,

sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))

S1A S1B S2A S2B
 22  38  38  22

how can I easily add the other 3 columns (N, S, and Z) in a new
dataframe? For S and Z, I can play with the names from the cross
product output, but I have trouble dealing with the numeric column N.

David L Carlson

Wed, Aug 24, 2016 1:24 PM #

This will work, but you should double-check to be certain that CP and unique(myData[, 3:5]) are in the same order. It will fail if N is not identical for all rows of the same S-Z combination.

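CP <- sapply(split(myData, paste0(myData$S, myData$Z)), function(x)
      crossprod(x[, 1], x[, 2]))

data.frame(CP, unique(myData[, 3:5]))

    CP   N  S Z
S1A 22 2.1 S1 A
S1B 38 2.1 S1 B
S2A 38 3.2 S2 A
S2B 22 3.2 S2 B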

David C
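
If the ordering caveat is a concern, one way around it is to build each output row from the rows of its own group, so the two pieces never have to be matched up afterwards. This is a sketch under the assumption that N is meant to be constant within each S-Z combination; it is a variation, not the code from the reply above:

```r
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  N = rep(c(2.1, 3.2), each = 4),
  S = rep(c("S1", "S2"), each = 4),
  Z = rep(c("A", "B"), each = 2, times = 2)
)

# Build one output row per group from that group's own rows, so CP cannot
# be paired with the N, S, Z values of a different group.
pieces <- lapply(split(myData, paste0(myData$S, myData$Z)), function(x) {
  stopifnot(length(unique(x$N)) == 1)  # stop if N varies within a group
  data.frame(CP = drop(crossprod(x$X, x$Y)), x[1, c("N", "S", "Z")],
             row.names = NULL)
})
result <- do.call(rbind, pieces)
print(result)
```

The stopifnot() call turns the silent failure mode described above (N differing within a group) into an explicit error.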

Gang Chen

Wed, Aug 24, 2016 1:42 PM #

Yes, this works out perfectly! Thanks a lot, David. Have a wonderful day...

On Wed, Aug 24, 2016 at 4:24 PM, David L Carlson <dcarlson at tamu.edu> wrote:

This will work, but you should double-check to be certain that CP and unique(myData[, 3:5]) are in the same order. It will fail if N is not identical for all rows of the same S-Z combination.

CP <- sapply(split(myData, paste0(myData$S, myData$Z)), function(x)

+       crossprod(x[, 1], x[, 2]))

data.frame(CP, unique(myData[, 3:5]))

    CP   N  S Z
S1A 22 2.1 S1 A
S1B 38 2.1 S1 B
S2A 38 3.2 S2 A
S2B 22 3.2 S2 B

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 2:51 PM
To: David L Carlson
Cc: r-help mailing list
Subject: Re: [R] aggregate

Thanks again for patiently offering great help, David! I just learned
dput() and paste0() now. Hopefully this is my last question.

Suppose a new dataframe is as below (one more numeric column):

myData <- structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), N =c(rep(2.1, 4), rep(3.2, 4)), S = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")),
.Names = c("X",
"Y", "N", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

myData

  X Y   N  S Z
1 1 8 2.1 S1 A
2 2 7 2.1 S1 A
3 3 6 2.1 S1 B
4 4 5 2.1 S1 B
5 5 4 3.2 S2 A
6 6 3 3.2 S2 A
7 7 2 3.2 S2 B
8 8 1 3.2 S2 B

Once I obtain the cross product,

sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))

S1A S1B S2A S2B
 22  38  38  22

how can I easily add the other 3 columns (N, S, and Z) in a new
dataframe? For S and Z, I can play with the names from the cross
product output, but I have trouble dealing with the numeric column N.




On Wed, Aug 24, 2016 at 1:07 PM, David L Carlson <dcarlson at tamu.edu> wrote:

You need to spend some time with a basic R tutorial. Your data is messed up because you did not use a simple text editor somewhere along the way. R understands ', but not ‘ or ’. The best way to send data to the list is to use dput:

dput(myData)

structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8), Y = c(8, 7, 6,
5, 4, 3, 2, 1), S = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("S1", "S2"), class = "factor"), Z = structure(c(1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("X",
"Y", "S", "Z"), row.names = c(NA, -8L), class = "data.frame")

Combining two labels just requires the paste0() function:

sapply(split(myData, paste0(myData$S, myData$Z)), function(x) crossprod(x[, 1], x[, 2]))

S1A S1B S2A S2B
 22  38  38  22
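
One thing to watch with paste0() keys: without a separator, two different label pairs can produce the same string. The sketch below uses hypothetical labels (S11 and 1A do not appear in the thread) purely to show the collision, and then separates the keys with paste(..., sep = ".").

```r
# Hypothetical labels, chosen only to demonstrate the collision.
d <- data.frame(
  X = c(1, 2, 3, 4),
  Y = c(4, 3, 2, 1),
  S = c("S1", "S1", "S11", "S11"),
  Z = c("1A", "1A", "A", "A")
)

# paste0() glues both groups into the same key, "S11A".
n_groups_paste0 <- length(unique(paste0(d$S, d$Z)))   # 1

# A separator keeps the two keys distinct: "S1.1A" and "S11.A".
key <- paste(d$S, d$Z, sep = ".")
n_groups_sep <- length(unique(key))                   # 2

# Cross product of X and Y within each (now distinct) group.
cp <- sapply(split(d, key), function(x) drop(crossprod(x$X, x$Y)))
cp
```

With the labels used in the thread (S1, S2, A, B) no collision occurs, so the paste0() version above gives the right answer there.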

David C

-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 11:56 AM
To: David L Carlson
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] aggregate

Thanks a lot, David! I want to further expand the operation a little
bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
3, 2, 1), S=c(‘S1’, ‘S1’, ‘S1’, ‘S1’, ‘S2’, ‘S2’, ‘S2’, ‘S2’),
Z=c(‘A’, ‘A’, ‘B’, ‘B’, ‘A’, ‘A’, ‘B’, ‘B’))

myData

  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y,
but at each combination of the levels of factors S and Z. In other words,
the cross product would still be computed for each pair of rows in the new
dataframe myData. How can I achieve that?

On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarlson at tamu.edu> wrote:

Yours is fine, but it will be a little simpler if you use sapply() instead:

data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
    function(x) crossprod(x[, 1], x[, 2])))
  Z CP
A A 10
B B 10

David C


-----Original Message-----
From: Gang Chen [mailto:gangchen6 at gmail.com]
Sent: Wednesday, August 24, 2016 10:17 AM
To: David L Carlson
Cc: Jim Lemon; r-help mailing list
Subject: Re: [R] aggregate

Thank you all for the suggestions! Yes, I'm looking for the cross
product between the two columns of X and Y.

A follow-up question: what is a nice way to merge the output of

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

with the column Z in myData so that I would get a new dataframe as the
following (the 2nd column is the cross product between X and Y)?

Z   CP
A   10
B   10

Is the following legitimate?

data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
myData$Z), function(x) crossprod(x[, 1], x[, 2]))))


On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarlson at tamu.edu> wrote:

Thank you for the reproducible example, but it is not clear what cross product you want. Jim's solution gives you the cross product of the 2-column matrix with itself. If you want the cross product between the columns you need something else. The aggregate function will not work since it will treat the columns separately:

A <- as.matrix(myData[myData$Z=="A", 1:2])
A

  X Y
1 1 4
2 2 3

crossprod(A) # Same as t(A) %*% A

   X  Y
X  5 10
Y 10 25

crossprod(A[, 1], A[, 2]) # Same as t(A[, 1]) %*% A[, 2]

     [,1]
[1,]   10

# For all the groups
lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))

$A
   X  Y
X  5 10
Y 10 25

$B
   X  Y
X 25 10
Y 10  5

lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))

$A
     [,1]
[1,]   10

$B
     [,1]
[1,]   10
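
For two vectors, crossprod(x, y) is just sum(x * y). That suggests one way to use aggregate() after all: form the product row by row first, then sum it within each level of Z. A minimal sketch, where the column name XY is made up for illustration:

```r
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

# Row-wise product; summing it within a group gives that group's
# cross product of X and Y.
myData$XY <- myData$X * myData$Y

cp <- aggregate(XY ~ Z, data = myData, FUN = sum)
names(cp)[2] <- "CP"
cp
```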

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
Sent: Tuesday, August 23, 2016 6:02 PM
To: Gang Chen; r-help mailing list
Subject: Re: [R] aggregate

Hi Gang Chen,
If I have the right idea:

for(zval in levels(myData$Z))
crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")]))

Jim
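
Note that R does not auto-print values inside a for loop, so the loop above computes each matrix and then discards it. A sketch that stores the results in a list instead; it uses unique() rather than levels() so that it also works when Z is a character column (the default since R 4.0.0), and the list name res is arbitrary:

```r
myData <- data.frame(X = c(1, 2, 3, 4), Y = c(4, 3, 2, 1),
                     Z = c("A", "A", "B", "B"))

res <- list()
for (zval in unique(as.character(myData$Z))) {
  rows <- myData$Z == zval
  # 2 x 2 matrix t(M) %*% M for the X and Y columns of this group
  res[[zval]] <- crossprod(as.matrix(myData[rows, c("X", "Y")]))
}

res$A
```

The off-diagonal entry of each matrix is the cross product between X and Y for that group.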

On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangchen6 at gmail.com> wrote:

This is a simple question: With a dataframe like the following

myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))

how can I get the cross product between X and Y for each level of
factor Z? My difficulty is that I don't know how to deal with the fact
that crossprod() acts on two variables in this case.
