Hello,

I am trying to re-code all my programs from SAS into R.

In SAS I use the following code:

proc sort data=upper;
by tdate stock_symbol expire strike;
run;
data upper1;
set upper;
by tdate stock_symbol expire strike;
if first.expire then output;
rename strike=astrike;
run;

on the following data set:

tdate stock_symbol expiration strike
9/11/2012 C 9/16/2012 11
9/11/2012 C 9/16/2012 12
9/11/2012 C 9/16/2012 13
9/12/2012 C 9/16/2012 14
9/12/2012 C 9/16/2012 15
9/12/2012 C 9/16/2012 16
9/12/2012 C 9/16/2012 17

to get the following results:

tdate stock_symbol expiration strike
9/11/2012 C 9/16/2012 11
9/12/2012 C 9/16/2012 14

How would I replicate this kind of logic in R?
I have seen PLY & data.table packages mentioned but don't see how they would do the job.

Thanks ahead for your help

--
View this message in context: http://r.789695.n4.nabble.com/How-to-replicate-SAS-by-group-processing-in-R-tp4645753.html
Sent from the R help mailing list archive at Nabble.com.
Hi,

Here is one way using ddply (from the plyr package). It keeps the first row within each tdate group:

dat <- read.table(text="tdate stock_symbol expiration strike
9/11/2012 C 9/16/2012 11
9/11/2012 C 9/16/2012 12
9/11/2012 C 9/16/2012 13
9/12/2012 C 9/16/2012 14
9/12/2012 C 9/16/2012 15
9/12/2012 C 9/16/2012 16
9/12/2012 C 9/16/2012 17",
header=TRUE)
library(plyr)
ddply(dat, .variables = c("tdate"), .fun = function(x) x[1, ])
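
If plyr is not at hand, the same thing can be sketched in base R. This is only a sketch under assumptions: I am guessing that `first.expire` is meant to pick the first row for each combination of tdate, stock_symbol and expiration (the data column is called `expiration`, not `expire`), and that the dates should sort chronologically, so they are parsed with as.Date. The object names `sorted`, `keys` and `upper1` are just illustrative.

```r
# Sample data from the original post
dat <- read.table(text = "tdate stock_symbol expiration strike
9/11/2012 C 9/16/2012 11
9/11/2012 C 9/16/2012 12
9/11/2012 C 9/16/2012 13
9/12/2012 C 9/16/2012 14
9/12/2012 C 9/16/2012 15
9/12/2012 C 9/16/2012 16
9/12/2012 C 9/16/2012 17",
header = TRUE, stringsAsFactors = FALSE)

# Parse the dates so that sorting is chronological, not alphabetical
dat$tdate      <- as.Date(dat$tdate, format = "%m/%d/%Y")
dat$expiration <- as.Date(dat$expiration, format = "%m/%d/%Y")

# Rough equivalent of PROC SORT with the same BY variables
sorted <- dat[order(dat$tdate, dat$stock_symbol, dat$expiration, dat$strike), ]

# Rough equivalent of "if first.expire then output":
# keep the first row for each (tdate, stock_symbol, expiration) combination
keys   <- c("tdate", "stock_symbol", "expiration")
upper1 <- sorted[!duplicated(sorted[keys]), ]

# Equivalent of "rename strike=astrike"
names(upper1)[names(upper1) == "strike"] <- "astrike"

print(upper1)
```

On the sample data this keeps the rows with strike 11 and 14, which matches the result asked for.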
Best,
Ista
On Oct 10, 2012, at 11:09 AM, ramoss wrote:

> Hello,
>
> I am trying to re-code all my programs from SAS into R.
>
> In SAS I use the following code:
>
> proc sort data=upper;
> by tdate stock_symbol expire strike;
> run;
> data upper1;
> set upper;
> by tdate stock_symbol expire strike;

I must have forgotten my SAS. (It was a long time ago, I will admit.) Would that have succeeded with the inclusion of 'strike' in that 'by' list?

> if first.expire then output;
> rename strike=astrike;
> run;
>
> on the following data set:
>
> tdate stock_symbol expiration strike
> 9/11/2012 C 9/16/2012 11
> 9/11/2012 C 9/16/2012 12
> 9/11/2012 C 9/16/2012 13
> 9/12/2012 C 9/16/2012 14
> 9/12/2012 C 9/16/2012 15
> 9/12/2012 C 9/16/2012 16
> 9/12/2012 C 9/16/2012 17
>
> to get the following results:
> tdate stock_symbol expiration strike
> 9/11/2012 C 9/16/2012 11
> 9/12/2012 C 9/16/2012 14

> dat[tapply(1:nrow(dat), list( dat$stock_symbol, dat$tdate), FUN= function(x) head(x,1) ), ]
      tdate stock_symbol expiration strike
1 9/11/2012            C  9/16/2012     11
4 9/12/2012            C  9/16/2012     14

>
> How would I replicate this kind of logic in R?
> I have seen PLY & data.table packages mentioned but don't see how they would
> do the job.

You must mean the 'plyr' package; there is no 'PLY'. I'm sure the 'ddply' function or data.table could do this. Here's another way with the R 'by' function, whose result is then row-bound using 'do.call':

> do.call( rbind, by(dat, list( dat$stock_symbol, dat$tdate), FUN= function(x) head(x,1) ) )
      tdate stock_symbol expiration strike
1 9/11/2012            C  9/16/2012     11
4 9/12/2012            C  9/16/2012     14

--
David Winsemius, MD
Alameda, CA, USA

On Wed, Oct 10, 2012 at 7:09 PM, ramoss <ramine.mossadegh at finra.org> wrote:

> In SAS I use the following code:
>
> proc sort data=upper;
> by tdate stock_symbol expire strike;
> run;
> data upper1;
> set upper;
> by tdate stock_symbol expire strike;
> if first.expire then output;
> rename strike=astrike;
> run;
>
> on the following data set:
>
> tdate stock_symbol expiration strike
> 9/11/2012 C 9/16/2012 11
> 9/11/2012 C 9/16/2012 12
> 9/11/2012 C 9/16/2012 13
> 9/12/2012 C 9/16/2012 14
> 9/12/2012 C 9/16/2012 15
> 9/12/2012 C 9/16/2012 16
> 9/12/2012 C 9/16/2012 17
>
> to get the following results:
> tdate stock_symbol expiration strike
> 9/11/2012 C 9/16/2012 11
> 9/12/2012 C 9/16/2012 14
>
> How would I replicate this kind of logic in R?

First, replicate it in some kind of universally understood language - like English. Nearly every alien in every sci-fi film I've seen speaks English, so that's a safe assumption :)

What does it do? Take the first record within groups defined by tdate? Why does your code say 'expire' but the data have 'expiration'?

Barry
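
On the "what does it do" question: in a SAS data step with a BY statement, FIRST.x is set on a row when x, or any BY variable listed before x, differs from the previous row. A rough R sketch of that flag follows. The helper name `first_flag` is made up for illustration, it assumes `expire` really means the `expiration` column, and it assumes the data are already sorted and have no missing values in the key columns.

```r
# Hypothetical helper: TRUE on each row where any of the key columns
# differs from the previous row (the first row is always TRUE).
# Assumes df is already sorted by the keys and the keys contain no NAs.
first_flag <- function(df, keys) {
  n <- nrow(df)
  if (n == 0) return(logical(0))
  changed <- rep(FALSE, n)
  for (k in keys) {
    v <- df[[k]]
    # Compare each value with the one on the row above it
    changed <- changed | c(TRUE, v[-1] != v[-n])
  }
  changed
}

# The sample data from the original post, already in sorted order
dat <- data.frame(
  tdate        = c(rep("9/11/2012", 3), rep("9/12/2012", 4)),
  stock_symbol = "C",
  expiration   = "9/16/2012",
  strike       = 11:17,
  stringsAsFactors = FALSE
)

# Mimic first.expire: the BY variables up to and including expiration
first_expire <- first_flag(dat, c("tdate", "stock_symbol", "expiration"))

# Rows for which the SAS step would output
print(dat[first_expire, ])
```

With this sample the flag is TRUE on rows 1 and 4, because tdate changes there while stock_symbol and expiration stay constant.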