subset data based on values in multiple columns
Hi
Dear list members, I am trying to create a subset of a data frame based on conditions in two columns, and after spending much time trying (and searching R-help) I have not had any luck. Essentially, I have a data frame that is something like this:
date <- as.POSIXct(c("2012-01-25", "2012-01-25", "2012-01-26", "2012-01-27", "2012-01-27", "2012-01-27"))
time <- as.POSIXct(c("13:20", "13:40", "14:00", "10:00", "10:20", "10:20"), format = "%H:%M")
count <- c(12, 14, 11, 12, 12, 8)
data <- data.frame(date, time, count)
which looks like:
        date     time count
1 2012-01-25 13:20:00    12
2 2012-01-25 13:40:00    14
3 2012-01-26 14:00:00    11
4 2012-01-27 10:00:00    12
5 2012-01-27 10:20:00    12
6 2012-01-27 10:20:00     8
I would like to create a subset by doing the following: for each unique date, only include one case, which will be the case with the max value for the column labelled "count". So the resulting subset would be:
        date     time count
2 2012-01-25 13:40:00    14
3 2012-01-26 14:00:00    11
4 2012-01-27 10:00:00    12
Some dates have two cases at which the "count" was the same, but I only want to include one case (I don't really mind which case it chooses, but if need be it could be based on the earliest "time" for which the same counts occurred). I have tried various loops with no success! I'm sure that there is an easy answer that I have not found! Any help is much appreciated!!
Just a few days ago a similar question was asked (selecting rows by the maximum value of one variable in a data frame, nested by another variable). Here is what was recommended, adapted to your column names:
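# Split by date, keep the first row with the largest count in each piece, then recombine.
# Note: the maximum is taken over "count" (column 3), not over "time" (column 2).
do.call("rbind", lapply(split(data, data$date), function(x) x[which.max(x$count), ]))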
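If the tie-breaking rule matters (the earliest "time" among equal counts), one possible alternative is to sort first and then keep the first row per date. This is only a minimal sketch, not the approach from the earlier thread. It assumes the times are stored as zero-padded "HH:MM" character strings (so they sort chronologically) and the dates as Date objects; the names dat, ord, sorted and best are made up for illustration.

```r
# Rebuild the example data under the assumptions stated above
# (Date column, "HH:MM" character times); names are illustrative only.
dat <- data.frame(
  date  = as.Date(c("2012-01-25", "2012-01-25", "2012-01-26",
                    "2012-01-27", "2012-01-27", "2012-01-27")),
  time  = c("13:20", "13:40", "14:00", "10:00", "10:20", "10:20"),
  count = c(12, 14, 11, 12, 12, 8)
)

# Sort by date, then by descending count, then by ascending time,
# so the row to keep comes first within each date.
ord <- order(dat$date, -dat$count, dat$time)
sorted <- dat[ord, ]

# Keep only the first row encountered for each date.
best <- sorted[!duplicated(sorted$date), ]
best
```

On the example data this keeps original rows 2, 3 and 4. Because the tie-break is part of the sort key, it does not depend on the order in which the rows happen to be stored.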
Regards
Petr
All the best, Chandra
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.