
Message-ID: <47fce0650510131214k5eeaaf41m@mail.gmail.com>
Date: 2005-10-13T19:14:24Z
From: Hans-Peter
Subject: aggregate slow with many rows - alternative?

Hi,

I use the code below to aggregate and count my test data. It works fine,
but on my real data (33'000 rows) the function is really slow (nothing
had happened after half an hour).

Does anybody know of other functions that I could use?

Thanks,
Hans-Peter

--------------
dat <- data.frame(Datum = c(32586, 32587, 32587, 32625, 32656,
                            32656, 32656, 32672, 32672, 32699),
                  FischerID = c(58395, 58395, 58395, 88434, 89953,
                                89953, 89953, 64395, 62896, 62870),
                  Anzahl = c(2, 2, 1, 1, 2, 1, 7, 1, 1, 2))

# one summary row per (Datum, FischerID) group: sum and row count
f <- function(x) data.frame(Datum = x[1, 1], FischerID = x[1, 2],
                            Anzahl = sum(x[, 3]), Cnt = nrow(x))

t.a <- do.call("rbind", by(dat, dat[, 1:2], f))   # slow for 33'000 rows
t.a <- t.a[order(t.a[, 1], t.a[, 2]), ]

  # show data
dat
t.a
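One alternative worth trying (a sketch, not benchmarked at 33'000 rows): compute the sum and the group count separately with aggregate(), which avoids building one small data.frame per group, then merge() the two results. The names t.b, sums, and cnts below are just illustrative.

```r
dat <- data.frame(Datum = c(32586, 32587, 32587, 32625, 32656,
                            32656, 32656, 32672, 32672, 32699),
                  FischerID = c(58395, 58395, 58395, 88434, 89953,
                                89953, 89953, 64395, 62896, 62870),
                  Anzahl = c(2, 2, 1, 1, 2, 1, 7, 1, 1, 2))

grp <- list(Datum = dat$Datum, FischerID = dat$FischerID)
sums <- aggregate(dat$Anzahl, by = grp, FUN = sum)     # per-group sum
cnts <- aggregate(dat$Anzahl, by = grp, FUN = length)  # per-group row count

t.b <- merge(sums, cnts, by = c("Datum", "FischerID"))
names(t.b)[3:4] <- c("Anzahl", "Cnt")
t.b <- t.b[order(t.b$Datum, t.b$FischerID), ]
t.b
```

Whether this is actually faster on 33'000 rows would need to be timed; the per-group overhead of by() plus rbind() is usually the bottleneck, and aggregate() sidesteps it.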