Skip to content
Prev 305513 / 398506 Next

aggregate() runs out of memory

I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns).
I want to get the result of
table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
alas, aggregate has been running for ~30 minute, RSS is 14G, VIRT is
24.3G, and no end in sight.
both V1 and V2 are characters (not factors).
Is there anything I could do to speed this up?
Thanks.