Message-ID: <1349908569.64213.YahooMailNeo@web142602.mail.bf1.yahoo.com>
Date: 2012-10-10T22:36:09Z
From: arun
Subject: How to replicate SAS by group processing in R
In-Reply-To: <90F81DB5-8714-4E2A-8048-DE427B953004@comcast.net>
Hi,
You can also try aggregate() or ddply():
dat2<-aggregate(dat,list(dat$stock_symbol,dat$tdate),FUN=function(x) head(x,1))
dat2[,3:6]
#      tdate stock_symbol expiration strike
#1 9/11/2012            C  9/16/2012     11
#2 9/12/2012            C  9/16/2012     14
library(plyr)
ddply(dat,.(stock_symbol,tdate), function(x) head(x,1))
#      tdate stock_symbol expiration strike
#1 9/11/2012            C  9/16/2012     11
#2 9/12/2012            C  9/16/2012     14
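To run the calls above in a clean session, dat has to exist first. The thread never shows how it was built, so the data.frame() call in this sketch is an assumption reconstructed from the sample data in the quoted question, with the dates kept as character strings for simplicity.

```r
# Assumed reconstruction of the poster's data; not shown in the thread.
dat <- data.frame(
  tdate        = c(rep("9/11/2012", 3), rep("9/12/2012", 4)),
  stock_symbol = "C",
  expiration   = "9/16/2012",
  strike       = 11:17,
  stringsAsFactors = FALSE
)

# First row of each stock_symbol/tdate group, as in the aggregate() call above.
dat2 <- aggregate(dat, list(dat$stock_symbol, dat$tdate),
                  FUN = function(x) head(x, 1))

# Columns 1:2 are the Group.1/Group.2 keys that aggregate() adds; drop them.
first_rows <- dat2[, 3:6]
print(first_rows)
```

Note that aggregate() applies head() to each column separately, so this only picks "the first row" because every column of a group is in the same order.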
A.K.
----- Original Message -----
From: David Winsemius <dwinsemius at comcast.net>
To: ramoss <ramine.mossadegh at finra.org>
Cc: r-help at r-project.org
Sent: Wednesday, October 10, 2012 5:42 PM
Subject: Re: [R] How to replicate SAS by group processing in R
On Oct 10, 2012, at 11:09 AM, ramoss wrote:
> Hello,
>
> I am trying to re-code all my programs from SAS into R.
>
> In SAS I use the following code:
>
> proc sort data=upper;
> by tdate stock_symbol expire strike;
> run;
> data upper1;
>  set upper;
>  by tdate stock_symbol expire strike;
I must have forgotten my SAS. (It was a long time ago, I will admit.) Would that have succeeded with the inclusion of 'strike' in that 'by' list?
>  if first.expire then output;
>  rename strike=astrike;
> run;
>
> on the following data set:
>
> tdate      stock_symbol  expiration  strike
> 9/11/2012  C             9/16/2012   11
> 9/11/2012  C             9/16/2012   12
> 9/11/2012  C             9/16/2012   13
> 9/12/2012  C             9/16/2012   14
> 9/12/2012  C             9/16/2012   15
> 9/12/2012  C             9/16/2012   16
> 9/12/2012  C             9/16/2012   17
>
> to get the following results:
> tdate      stock_symbol  expiration  strike
> 9/11/2012  C             9/16/2012   11
> 9/12/2012  C             9/16/2012   14
> dat[tapply(1:nrow(dat), list( dat$stock_symbol, dat$tdate), FUN= function(x) head(x,1) ), ]
      tdate stock_symbol expiration strike
1 9/11/2012            C  9/16/2012     11
4 9/12/2012            C  9/16/2012     14
>
>
> How would I replicate this kind of logic in R?
> I have seen PLY & data.table packages mentioned but don't see how they would
> do the job.
You must mean the 'plyr' package; there is no 'PLY'. I'm sure the 'ddply' function or data.table could do this.
Here's another way with the R 'by' function which is then row-bound using 'do.call':
> do.call( rbind, by(dat, list( dat$stock_symbol, dat$tdate), FUN= function(x) head(x,1) ) )
      tdate stock_symbol expiration strike
1 9/11/2012            C  9/16/2012     11
4 9/12/2012            C  9/16/2012     14
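One detail the R calls above simplify: in the SAS step, first.expire fires whenever tdate, stock_symbol or the expiry changes, so the grouping is effectively on three variables, and strike is renamed to astrike. A base-R sketch of that reading, using order() and duplicated(), might look like the following. The data.frame is an assumed reconstruction of the sample data, the third key is taken to be the expiration column from the data listing (the SAS code calls it expire), and the optional data.table lines only run if that package happens to be installed.

```r
# Assumed reconstruction of the sample data from the question.
dat <- data.frame(
  tdate        = c(rep("9/11/2012", 3), rep("9/12/2012", 4)),
  stock_symbol = "C",
  expiration   = "9/16/2012",
  strike       = 11:17,
  stringsAsFactors = FALSE
)

keys <- c("tdate", "stock_symbol", "expiration")

# proc sort: order by the BY variables. The dates are character strings here,
# so they sort as text; for real data convert them first, e.g. with
# as.Date(x, format = "%m/%d/%Y").
sorted <- dat[order(dat$tdate, dat$stock_symbol, dat$expiration, dat$strike), ]

# first.expire: keep the first row of each tdate/stock_symbol/expiration group.
upper1 <- sorted[!duplicated(sorted[keys]), ]

# rename strike=astrike
names(upper1)[names(upper1) == "strike"] <- "astrike"
print(upper1)

# Optional data.table equivalent, skipped when the package is not installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  DT <- data.table::as.data.table(sorted)
  print(unique(DT, by = keys))
}
```

Because the data are sorted on the keys first, !duplicated() on those keys selects the same rows that first.expire would flag.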
--
David Winsemius, MD
Alameda, CA, USA