by-group processing

how about:

d <- data[order(data$ID, -data$N), ]   # sort by ID, then by N descending
d[!duplicated(d$ID), ]                 # keep the first (largest-N) row per ID
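
For anyone who wants to try the sort-then-deduplicate idea end to end, here is a minimal sketch. It assumes the sort key is N (descending) within ID, since negating a character or factor column such as Type does not work in R. The data frame is rebuilt by hand from the values shown in the question, and the name `result` is only for illustration.

```r
# Rebuild the example data from the question (values copied from the post,
# including its row names 1-10 and 18-24).
data <- data.frame(
  ID   = c(rep(45900, 7), rep(49270, 3), rep(46550, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7),
  row.names = c(1:10, 18:24)
)

# Sort by ID, then by N descending, so the largest N comes first within each ID.
d <- data[order(data$ID, -data$N), ]

# duplicated() flags every row after the first one for each ID;
# negating it keeps only that first (maximum-N) row.
result <- d[!duplicated(d$ID), ]
print(result)
```

The rows come back sorted by ID rather than in their original order. If two rows share the maximum N within an ID, only the first one after sorting is kept.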
Given a dataframe like

  > data
        ID Type N
  1  45900    A 1
  2  45900    B 2
  3  45900    C 3
  4  45900    D 4
  5  45900    E 5
  6  45900    F 6
  7  45900    I 7
  8  49270    A 1
  9  49270    B 2
  10 49270    E 3
  18 46550    A 1
  19 46550    B 2
  20 46550    C 3
  21 46550    D 4
  22 46550    E 5
  23 46550    F 6
  24 46550    I 7
  >
containing an identifier (ID), a variable type code (Type), and
a running count of the number of records per ID (N), how can I
return a dataframe of only those records with the maximum value
of N for each ID? For instance,

  > data
        ID Type N
  7  45900    I 7
  10 49270    E 3
  24 46550    I 7

I know that I can use

   > tapply ( data $ N , data $ ID , max )
   45900 46550 49270
       7     7     3
   >
to get the maximum value of N for each ID, but how can I find
the row indices of these values so that I can use them to
subscript data?
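
One possible way to turn the tapply() result into row indices is sketched below. This is an illustration rather than something taken from the thread: the small data frame is a hand-made subset of the rows above, and the names `max_n`, `is_max` and `idx` are hypothetical.

```r
# Small stand-in for the data frame in the question (a subset of its rows).
data <- data.frame(
  ID   = c(45900, 45900, 49270, 49270, 49270),
  Type = c("F", "I", "A", "B", "E"),
  N    = c(6, 7, 1, 2, 3)
)

# tapply() returns one maximum per ID, named by the ID value.
max_n <- tapply(data$N, data$ID, max)

# Look up each row's group maximum by name, then compare it with the row's N.
is_max <- data$N == as.vector(max_n[as.character(data$ID)])

# which() converts the logical vector into row indices for subscripting.
idx <- which(is_max)
data[idx, ]
```

Unlike the sort-and-deduplicate approach, this keeps the original row order and returns every row that ties for the maximum within its ID.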

--
maxine-webber


View this message in context: http://www.nabble.com/by-group-processing-tp23417208p23437592.html
Sent from the R help mailing list archive at Nabble.com.
