Counting occurrences of variables in a dataframe

Sat, Feb 11, 2012 10:59 AM

On Sat, Feb 11, 2012 at 07:17:54PM +0100, Kai Mx wrote:

Hi.

Is the first 2 in the new variable there because the name
is "ab" and the "ab" at row 5 has an older date? If so,
then try the following:

  ind <- order(kdata$kdate)
  f <- function(x) seq.int(along.with=x)
  kdata$x <- ave(1:nrow(kdata), kdata$knames[ind], FUN=f)[order(ind)]

     knames      kdate x
  1      ab 2011-10-01 2
  2      aa 2011-11-02 2
  3      ac 2010-10-01 1
  4      ad 2010-03-15 1
  5      ab 2010-12-01 1
  6      ac 2011-01-05 2
  7      aa 2010-10-01 1
  8      ad 2011-05-04 2
  9      ae 2011-06-03 1
  10     af 2011-02-01 1

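kdata$knames[ind] lists the names in order of increasing date, so ave()
numbers the rows of each name from oldest to newest.
ave(...)[order(ind)] then puts the output of ave() back in the original
row order, since order(ind) is the inverse permutation of ind.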
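
In case it helps to see the two reordering steps separately, here is a small self-contained sketch. The data frame is rebuilt from the output printed above; storing kdate as class Date is my assumption (the original column type is not shown), and the names inv and counts_sorted are only introduced for illustration.

```r
# Rebuild the example data from the printed output
# (assumption: kdate is stored as a Date).
kdata <- data.frame(
  knames = c("ab", "aa", "ac", "ad", "ab", "ac", "aa", "ad", "ae", "af"),
  kdate  = as.Date(c("2011-10-01", "2011-11-02", "2010-10-01", "2010-03-15",
                     "2010-12-01", "2011-01-05", "2010-10-01", "2011-05-04",
                     "2011-06-03", "2011-02-01")),
  stringsAsFactors = FALSE
)

ind <- order(kdata$kdate)  # row numbers sorted by increasing date
inv <- order(ind)          # inverse permutation: where each row sits in that sorted order

# Running count within each name, taken over the rows in date order.
f <- function(x) seq.int(along.with = x)
counts_sorted <- ave(seq_len(nrow(kdata)), kdata$knames[ind], FUN = f)

# counts_sorted[i] belongs to row ind[i]; indexing by inv restores row order.
kdata$x <- counts_sorted[inv]
print(kdata)
```

Splitting the one-liner this way makes it easier to check that ind[inv] is simply 1:nrow(kdata), which is why the final indexing step restores the original order.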

Hope this helps.

Petr Savicky.
