Group by multiple variables

3 messages · Mendolia, Franco, Baptiste Auguie, Sébastien Bihorel

#
Hello,

I would like to create a group variable that is based on the values of three variables:

For example,
dat <- data.frame(A=c(1,1,1,1,1,2,2,2,2,2),
                  B=c(1,1,1,5,5,5,9,9,9,9),
                  C=c(1,1,1,1,1,2,2,7,7,7))
   A B C
 1  1 1 1
 2  1 1 1
 3  1 1 1
 4  1 5 1
 5  1 5 1
 6  2 5 2
 7  2 9 2
 8  2 9 7
 9  2 9 7
10 2 9 7

All rows that are equal belong to the same group; that is, I would like to create a group variable like this:

   A B C group
 1  1 1 1   1
 2  1 1 1   1
 3  1 1 1   1
 4  1 5 1   2
 5  1 5 1   2
 6  2 5 2   3
 7  2 9 2   4
 8  2 9 7   5
 9  2 9 7   5
10 2 9 7   5

Is there an easy way to do this? Right now I use a bunch of loops and that seems rather cumbersome. In general, my data set is rather large and the number of categories per variable is not always fixed.

Franco
#
Hi,

There are probably much better ways, but try this

transform(dat, group = as.numeric(factor(paste(A,B,C, sep=""))))

HTH,

baptiste
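
A note on the paste() approach above: with sep="", two different rows can collapse to the same key once any variable has more than one digit. The following is a minimal sketch, not part of the original reply, that uses a separator and numbers the groups in order of first appearance. It assumes the separator "_" never occurs in the values, and it rebuilds dat from the table in the question.

```r
# Data reconstructed from the table in the original question
dat <- data.frame(A = c(1,1,1,1,1,2,2,2,2,2),
                  B = c(1,1,1,5,5,5,9,9,9,9),
                  C = c(1,1,1,1,1,2,2,7,7,7))

# With sep = "", different combinations can yield the same key
collide <- paste(1, 11, 1, sep = "") == paste(11, 1, 1, sep = "")
collide   # TRUE: both keys are "1111"

# A separator keeps the keys distinct
# (assumption: "_" does not appear in any of the values)
key <- paste(dat$A, dat$B, dat$C, sep = "_")

# Number the groups in order of first appearance
dat$group <- match(key, unique(key))
dat$group   # 1 1 1 2 2 3 4 5 5 5
```

match(key, unique(key)) gives the first distinct row group 1, the next new combination group 2, and so on, which reproduces the group column requested in the question.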
On 31 May 2011 09:47, Mendolia, Franco <fmendolia at mcw.edu> wrote:
#
On Mon, 30 May 2011 16:47:45 -0500,
"Mendolia, Franco" <fmendolia at mcw.edu> wrote:

            
[...]

One option:

dat <- within(dat, {grp <- factor(paste(A, B, C))})
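
The within() call above stores the group as a factor. If integer group numbers are wanted, as in the question, one possible follow-up is sketched below. The column name group and the reconstruction of dat from the question's table are assumptions for illustration.

```r
# Data reconstructed from the table in the original question
dat <- data.frame(A = c(1,1,1,1,1,2,2,2,2,2),
                  B = c(1,1,1,5,5,5,9,9,9,9),
                  C = c(1,1,1,1,1,2,2,7,7,7))

# Factor with one level per distinct (A, B, C) combination
dat <- within(dat, {grp <- factor(paste(A, B, C))})
nlevels(dat$grp)   # 5

# Integer codes of the factor; they follow the sorted order of the
# pasted strings, not the order in which the rows appear
dat$group <- as.integer(dat$grp)
dat$group          # 1 1 1 2 2 3 4 5 5 5

# Number of rows in each group
sizes <- table(dat$group)
```

Because the levels are sorted as character strings, a key starting with "10" sorts before one starting with "2". For this example the sorted order happens to match the row order, but that will not hold in general.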