-----Original Message-----
From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
Sent: Friday, August 24, 2012 10:51 AM
To: William Dunlap
Cc: Peter Alspach; rjb; r-help at r-project.org
Subject: Re: [R] Extracting data from dataframe with tied rows
Here's another pretty straightforward solution, using the plyr pkg:
DF <- data.frame(id, month, distance, bearing)
# variables as defined in the OP
require(plyr)
DF1<-ddply(DF, .(id,month), summarize,
maxdist = max(distance),
maxbearing = bearing[which.max(distance)])
Peter Ehlers
On 2012-08-24 09:54, William Dunlap wrote:
Or use ave() to compute the within-group ranks (reversed, so max has rank 1) and
the elements whose ranks are 1:
f2 <- function (DATA)
{
stopifnot(is.data.frame(DATA), all(c("distance", "id", "month") %in%
names(DATA)))
revRanks <- ave(DATA[["distance"]], DATA[["id"]], DATA[["month"]],
FUN = function(x) rank(-x, ties = "first"))
DATA[revRanks == 1, ]
}
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Of Peter Alspach
Sent: Thursday, August 23, 2012 4:37 PM
To: rjb; r-help at r-project.org
Subject: Re: [R] Extracting data from dataframe with tied rows
Tena koe John
One way:
johnData <- data.frame(id=rep(LETTERS[1:5],20), distance=rnorm(1:100, mean = 100),
bearing=sample(1:360,100,replace=T), month=sample(1:12,100,replace=T))
johnAgg <- aggregate(johnData[,'distance'], johnData[,c('id','month')], max)
names(johnAgg)[3] <- 'distance'
merge(johnAgg, johnData)
HTH ....
Peter Alspach
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Of rjb
Sent: Friday, 24 August 2012 9:19 a.m.
To: r-help at r-project.org
Subject: [R] Extracting data from dataframe with tied rows
Hi R help,
I'm a fairly experienced R user but this manipulation has me stumped, please
help:
DATA
id<-rep(LETTERS[1:5],20)
distance<-rnorm(1:100, mean = 100)
bearing<-sample(1:360,100,replace=T)
month<-sample(1:12,100,replace=T)
I have a dataset with records of individuals (id) , each with a distance
(distance) & direction (bearing) recorded for each month (month).
I want to find the largest distance per individual per month, which is easy
with /tapply/ or /melt/cast (reshape)/,
head(DATA_m<-melt(DATA,id=c("id","month")))
cast(DATA_m,id+month~.,max)
OR
na.omit(melt(tapply(distance,list(id,month),max)))
*BUT THE CATCH IS* ,
I also want the the *corresponding* bearing for that maximum distance per
month. I've tried the steps above plus using which.max() and loops, but
can't solve the problem. The real dataset is about 6000 rows.
I'm guessing the answer is in finding the row number from the original DATA
but I can't figure how to do that with tapply or melt.
Any suggestions would be greatly appreciated.
John Burnside