Number of replications of a term
Note that the rle() approach assumes that all occurrences of a value are contiguous.
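To illustrate the contiguity caveat, here is a minimal sketch on a made-up vector (not from the thread) in which "ID2" appears in two separate runs. The sort-and-restore workaround at the end is an assumption of this example, not something proposed in the thread.

```r
# Hypothetical input: "ID2" occurs twice, but not contiguously.
ids <- c("ID1", "ID2", "ID3", "ID2")

# rle() counts runs, so each "ID2" is seen as its own run of length 1.
l <- rle(ids)$lengths
by_run <- rep(l, l)

# table() counts occurrences regardless of position.
by_value <- as.vector(table(ids)[ids])

# Possible workaround (assumption): sort, apply the rle trick,
# then map the counts back to the original order.
o <- order(ids)
l2 <- rle(ids[o])$lengths
by_sorted <- rep(l2, l2)[order(o)]
```

Here by_run disagrees with by_value for the two "ID2" entries, while by_sorted matches by_value.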
On 1/24/06, Ray Brownrigg <ray at mcs.vuw.ac.nz> wrote:
There's an even faster one, which nobody seems to have mentioned yet:

rep(l <- rle(ids)$lengths, l)

Timing on my 2.8GHz NetBSD system shows:
length(ids)
[1] 45150
# Gabor:
system.time(for (i in 1:100) ave(as.numeric(factor(ids)), ids, FUN = length))
[1] 3.45 0.06 3.54 0.00 0.00
# Barry (and others I think):
system.time(for (i in 1:100) table(ids)[ids])
[1] 2.13 0.05 2.20 0.00 0.00
# Me:
system.time(for (i in 1:100) rep(l <- rle(ids)$lengths, l))
[1] 1.60 0.00 1.62 0.00 0.00

Of course the difference between 21 milliseconds and 16 milliseconds is not great, unless you are doing this a lot.

Ray Brownrigg
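The vector used for the timings above is not shown in the thread. As a rough sketch, the comparison could be reproduced on a stand-in vector like the one below, which is an assumption chosen only because it has the same length (45150) and keeps equal IDs contiguous. Timings will differ by machine and R version.

```r
# Hypothetical stand-in for ids: 300 distinct IDs, the k-th repeated k times.
ids <- rep(paste0("ID", 1:300), times = 1:300)

# The three approaches from the thread.
r_ave   <- ave(as.numeric(factor(ids)), ids, FUN = length)
r_table <- as.vector(table(ids)[ids])
l       <- rle(ids)$lengths
r_rle   <- rep(l, l)

# They agree here because equal IDs are contiguous.
stopifnot(all(r_ave == r_table), identical(r_table, r_rle))

# Time one of them; swap in the other expressions to compare.
system.time(for (i in 1:100) rep(l <- rle(ids)$lengths, l))
```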
From: Gabor Grothendieck <ggrothendieck at gmail.com>

Nice. I timed it and it's much faster than mine too.

On 1/24/06, Barry Rowlingson <B.Rowlingson at lancaster.ac.uk> wrote:
Laetitia Marisa wrote:
Hello,

Is there a simple and fast function that returns, for each element of a vector, the number of times that element occurs in the vector?

For example, I have a vector of IDs:

ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

I want the function to return the following vector, where each term is the number of replicates of the corresponding ID:

c(1, 2, 2, 3, 3, 3, 1)
One-liner:
> table(ids)[ids]
ids
ID1 ID2 ID2 ID3 ID3 ID3 ID5
  1   2   2   3   3   3   1

'table(ids)' computes the counts, then the subscripting [ids] looks it all up. Now try it on your 40,000-long vector!

Barry
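The two steps in the one-liner can be spelled out as follows. This is only a sketch using the example vector from the question; the variable names counts, per_element and result are made up for illustration.

```r
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# Step 1: one count per distinct ID, with the IDs as names.
counts <- table(ids)

# Step 2: index by name, once for every element of ids,
# so each position receives the count of its own ID.
per_element <- counts[ids]

# Optionally drop the names and table class to get a plain integer vector.
result <- as.vector(per_element)
```

The as.vector() step is not in the original one-liner. It only strips the labels so that the result matches the plain c(1, 2, 2, 3, 3, 3, 1) form asked for.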