
Counting number of rows with two criteria in dataframe

8 messages · Henrique Dallazuanna, Ista Zahn, David Winsemius +4 more

#
Hi Ryan,
One option would be

X$a <- paste(X$x, X$y, sep=".")
table(X$a)
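
A minimal sketch of the above on made-up data, since the original X is not shown in this thread. The column names x, y and z are taken from the replies; the values are assumptions for illustration only. Note that this counts every row in each x.y combination, repeated z values included.

```r
# Hypothetical data: two grouping columns (x, y) and a value column (z).
X <- data.frame(
  x = rep(1:3, each = 6),
  y = rep(1:6, each = 3),
  z = rep(c(10, 10, 20), times = 6)
)

# Build one combined key from the two criteria, then tabulate it.
X$a <- paste(X$x, X$y, sep = ".")
counts <- table(X$a)
print(counts)

# The same row counts as a two-way table, without the helper column.
xy <- table(X$x, X$y)
print(xy)
```

Combinations that never occur show up as 0 in the two-way table, whereas the pasted key simply omits them.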

Best,
Ista
On Tue, Jan 25, 2011 at 2:25 PM, Ryan Utz <utz.ryan at gmail.com> wrote:

  
    
#
On Jan 25, 2011, at 2:25 PM, Ryan Utz wrote:

            
> tapply(X$z, list(X$x, X$y), function(xx) length(unique(xx)) )
    1  2  3  4  5  6
1  2  2 NA NA NA NA
2 NA NA  2  2 NA NA
3 NA NA NA NA  2  2
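
For anyone who wants to reproduce a table of that shape, here is a rough sketch on made-up data (the values below are assumptions, not Ryan's actual data frame). It also shows a long-format variant with aggregate(), which is a swapped-in base R alternative that lists only the combinations that actually occur.

```r
# Hypothetical data: each (x, y) pair has three rows but only two distinct z.
X <- data.frame(
  x = rep(1:3, each = 6),
  y = rep(1:6, each = 3),
  z = rep(c(10, 10, 20), times = 6)
)

# Wide result: distinct z values per (x, y) cell; empty cells are NA.
wide <- tapply(X$z, list(X$x, X$y), function(xx) length(unique(xx)))
print(wide)

# Long result: one row per observed (x, y) combination.
long <- aggregate(z ~ x + y, data = X, FUN = function(xx) length(unique(xx)))
print(long)
```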
David Winsemius, MD
West Hartford, CT
#
Note that a key is not actually required, so the syntax is even simpler:

dX = as.data.table(X)
dX[,length(unique(z)),by="x,y"]
     x y V1
[1,] 1 1  2
[2,] 1 2  2
[3,] 2 3  2
[4,] 2 4  2
[5,] 3 5  2
[6,] 3 6  2

Passing list() to 'by' gives exactly the same result:

dX[,length(unique(z)),by=list(x,y)]

The advantage of the list() form is that you can group by expressions
of columns, for example if x were a date column:

dX[,length(unique(z)),by=list(month(x),y)]
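
To make the grouping-by-expression idea concrete, here is a small sketch on made-up data. The date column is called d here (a hypothetical name, standing in for the "x as a date" case above), and the grouping expression and the result are given explicit names (mon, n) so the output columns are predictable.

```r
library(data.table)

# Hypothetical data: d is a Date column, y a second grouping column.
dX <- data.table(
  d = as.Date(c("2011-01-05", "2011-01-20", "2011-01-25",
                "2011-02-03", "2011-02-14", "2011-02-14")),
  y = c("a", "a", "a", "b", "b", "b"),
  z = c(1, 1, 2, 3, 4, 5)
)

# Group by the month extracted from d (an expression) and by y (a column),
# counting the distinct z values in each group.
res <- dX[, list(n = length(unique(z))), by = list(mon = month(d), y)]
print(res)
```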

Matthew


"Dennis Murphy" <djmuser at gmail.com> wrote in message 
news:AANLkTi=8TYSrRfzfm01m7fpzydh-cLS-J-cMbkAkjXxf at mail.gmail.com...
#
On Wed, Jan 26, 2011 at 5:27 AM, Dennis Murphy <djmuser at gmail.com> wrote:
Another approach is to use the much faster count function from plyr,
counting the de-duplicated rows by the two grouping columns:

count(unique(X), c("x", "y"))
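
A rough sketch of how this plays out on made-up data, assuming count here is plyr::count and that X has only the columns x, y and z (both are assumptions; the original X is not shown in this thread). De-duplicating first means each remaining row is one distinct z within its (x, y) pair, so counting rows per pair gives the number of distinct z values.

```r
library(plyr)

# Hypothetical data: each (x, y) pair has three rows but only two distinct z.
X <- data.frame(
  x = rep(1:3, each = 6),
  y = rep(1:6, each = 3),
  z = rep(c(10, 10, 20), times = 6)
)

# Drop duplicate (x, y, z) rows, then count the rows left per (x, y).
res <- count(unique(X), vars = c("x", "y"))
print(res)
```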

Hadley