Message-ID: <87626afkhr.fsf@gnu.org>
Date: 2012-10-16T16:29:52Z
From: Sam Steingold
Subject: uniq -c
In-Reply-To: <CAAmySGNg-++Q5WHRktQR3Q+SzWy08Mm9M64bQGcb1oDYcSVvfw@mail.gmail.com> (R. Michael Weylandt's message of "Tue, 16 Oct 2012 16:19:27 +0100")

> * R. Michael Weylandt <zvpunry.jrlynaqg at tznvy.pbz> [2012-10-16 16:19:27 +0100]:
>
> Have you looked at using table() directly? If I understand what you
> want correctly something like:
>
> table(do.call(paste, x))

I want to avoid paste(): I will have to re-split the pasted keys later,
which would be a performance nightmare.

> Also, if you take a look at the development version of R, changes are
> being put in place to allow much larger data sets.
>>
>> xtabs(), although dog slow, would have footed the bill nicely:
>> --8<---------------cut here---------------start------------->8---
>>> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
>>> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
>>    user  system elapsed
>>  12.788   4.288  17.224
>> --8<---------------cut here---------------end--------------->8---

You should not need "much larger data sets" for this:
x is already sorted.
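
A minimal sketch of what a single pass over already-sorted rows could look
like, in the spirit of `uniq -c`, without paste() or xtabs(). The name
uniq_c is hypothetical (it is not an existing R function), and the sketch
assumes the data frame has no NA values and that identical rows are
adjacent:

```r
# Hypothetical helper: count runs of identical adjacent rows,
# like `uniq -c` on sorted input. Assumes no NAs in df.
uniq_c <- function(df) {
  n <- nrow(df)
  if (n == 0L) {
    df$Freq <- integer(0)
    return(df)
  }
  # changed[i] is TRUE when row i+1 differs from row i in any column
  changed <- rep(FALSE, n - 1L)
  for (col in df) {
    changed <- changed | (col[-1L] != col[-n])
  }
  starts <- c(1L, which(changed) + 1L)  # first row of each run
  ends <- c(starts[-1L] - 1L, n)        # last row of each run
  out <- df[starts, , drop = FALSE]     # one representative row per run
  out$Freq <- ends - starts + 1L        # run lengths
  rownames(out) <- NULL
  out
}

# Small sorted example (made-up data, for illustration only)
x <- data.frame(a = c(1, 1, 2, 2, 2), b = c("u", "u", "u", "v", "v"))
uniq_c(x)
```

The work is proportional to rows times columns, rather than to the product
of the numbers of distinct values per column as with a full contingency
table.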

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://openvotingconsortium.org http://iris.org.il
http://www.memritv.org http://memri.org http://think-israel.org
Just because you're paranoid doesn't mean they AREN'T after you.