by-group processing
How about sorting by ID and then by descending N, and keeping the first row for each ID:
d <- data[order(data$ID, -data$N), ]
d[!duplicated(d$ID), ]
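A minimal sketch of the sort-then-deduplicate idea on the example data from the question. The data frame is rebuilt by hand here, so its row names (1 to 17) are an assumption and differ from those in the original post. The sketch sorts on N in descending order, since N is the numeric column whose maximum is wanted.

```r
# Example data rebuilt from the question; row names are 1..17 here,
# not the 1..10, 18..24 shown in the original post.
data <- data.frame(
  ID   = c(rep(45900, 7), rep(49270, 3), rep(46550, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7)
)

# Sort by ID, then by N descending, so each ID's largest N comes first.
d <- data[order(data$ID, -data$N), ]

# duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps exactly one row per ID: the one with the largest N.
result <- d[!duplicated(d$ID), ]
print(result)
```

Because of the sort, the result comes back ordered by ID rather than in the original row order.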
Max Webber wrote:
Given a dataframe like
> data
      ID Type N
1  45900    A 1
2  45900    B 2
3  45900    C 3
4  45900    D 4
5  45900    E 5
6  45900    F 6
7  45900    I 7
8  49270    A 1
9  49270    B 2
10 49270    E 3
18 46550    A 1
19 46550    B 2
20 46550    C 3
21 46550    D 4
22 46550    E 5
23 46550    F 6
24 46550    I 7
>
containing an identifier (ID), a variable type code (Type), and a running count of the number of records per ID (N), how can I return a dataframe of only those records with the maximum value of N for each ID? For instance,
> data
      ID Type N
7  45900    I 7
10 49270    E 3
24 46550    I 7
I know that I can use
> tapply(data$N, data$ID, max)
45900 46550 49270
    7     7     3
>
to get the maximum value of N for each ID, but how can I find the row indices of these values so that I can use them to subscript data?
-- maxine-webber
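One possible way to get the row indices directly, sketched here as an alternative rather than as the method proposed in the thread: `ave()` returns the per-group maximum aligned with the original rows, so comparing it with N marks the rows to keep. The data frame is rebuilt by hand, so its row names (1 to 17) are an assumption and differ from the original post.

```r
# Example data rebuilt from the question; row names are 1..17 here.
data <- data.frame(
  ID   = c(rep(45900, 7), rep(49270, 3), rep(46550, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7)
)

# ave() applies max within each ID and returns a vector as long as data$N,
# so every row carries the maximum N of its own group.
group_max <- ave(data$N, data$ID, FUN = max)

# Row positions where N equals its group maximum.
idx <- which(data$N == group_max)

# Subscript the original data frame; the original row order is preserved.
result <- data[idx, ]
print(result)
```

If several rows in a group shared the maximum, this would return all of them. With a running count such as N that should not happen.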