Using cumsum with 'group by' ?
On Nov 23, 2012, at 12:04 , TheRealJimShady wrote:
Hi Arun & everyone, Thank you very much for your helpful suggestions. I've been working through them, but have realised that my data is a little more complicated than I said and that the solutions you've kindly provided don't work. The problem is that there is more than one day of data for each person. It looks like this: id x date 1 5 2012-06-05 12:01 1 10 2012-06-05 12:02 1 45 2012-06-05 12:03 2 5 2012-06-05 12:01 2 3 2012-06-05 12:03 2 2 2012-06-05 12:05 3 5 2012-06-05 12:03 3 5 2012-06-05 12:04 3 8 2012-06-05 12:05 1 5 2012-06-08 13:01 1 9 2012-06-08 13:02 1 3 2012-06-08 13:03 2 0 2012-06-08 13:15 2 1 2012-06-08 13:18 2 8 2012-06-08 13:20 2 4 2012-06-08 13:21 3 6 2012-06-08 13:15 3 2 2012-06-08 13:16 3 7 2012-06-08 13:17 3 2 2012-06-08 13:18 So what I need to do is something like this (in pseudo code anyway): - Order the data by the id field and then the date field - add a new variable called cumsum - calculate this variable as the cumulative value of X, but grouping by the id and date (not date, not date and time).
Did you miss an 'e' on the first 'not'?? Otherwise, I'm confused. Why do people always forget about ave()? I'd try newdata <- transform(mydata, csum=ave(x, id, as.Date(date), FUN=cumsum)) You still need to sort, of course, at least by date (incl. time). Check carefully whether time zone is an issue for as.Date --- you may need the tz argument. Also notice that I shy using an R function name as a variable name. This is mostly superstition these days, but better safe than sorry.
Thank you James On 23 November 2012 03:54, arun kirshna [via R] <ml-node+s789695n4650505h81 at n4.nabble.com> wrote:
Hi, No problem. One more method if you wanted to try: library(data.table) dat2<-data.table(dat1) dat2[,list(x,time,Cumsum=cumsum(x)),list(id)] # id x time Cumsum #1: 1 5 12:01 5 #2: 1 14 12:02 19 #3: 1 6 12:03 25 #4: 1 3 12:04 28 #5: 2 98 12:01 98 #6: 2 23 12:02 121 #7: 2 1 12:03 122 #8: 2 4 12:04 126 #9: 3 5 12:01 5 #10: 3 65 12:02 70 #11: 3 23 12:03 93 #12: 3 23 12:04 116 A.K. ----- Original Message ----- From: TheRealJimShady <[hidden email]> To: [hidden email] Cc: Sent: Thursday, November 22, 2012 12:27 PM Subject: Re: [R] Using cumsum with 'group by' ? Thank you very much, I will try these tomorrow morning. On 22 November 2012 17:25, arun kirshna [via R] <[hidden email]> wrote:
HI, You can do this in many ways: dat1<-read.table(text=" id time x 1 12:01 5 1 12:02 14 1 12:03 6 1 12:04 3 2 12:01 98 2 12:02 23 2 12:03 1 2 12:04 4 3 12:01 5 3 12:02 65 3 12:03 23 3 12:04 23 ",sep="",header=TRUE,stringsAsFactors=FALSE) dat1$Cumsum<-ave(dat1$x,dat1$id,FUN=cumsum) #or unlist(tapply(dat1$x,dat1$id,FUN=cumsum),use.names=FALSE) # [1] 5 19 25 28 98 121 122 126 5 70 93 116 #or library(plyr) ddply(dat1,.(id),function(x) cumsum(x[3]))[,2] # [1] 5 19 25 28 98 121 122 126 5 70 93 116 head(dat1) # id time x Cumsum #1 1 12:01 5 5 #2 1 12:02 14 19 #3 1 12:03 6 25 #4 1 12:04 3 28 #5 2 12:01 98 98 #6 2 12:02 23 121 A.K.
________________________________ If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650459.html To unsubscribe from Using cumsum with 'group by' ?, click here. NAML
-- View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650461.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]]
______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ________________________________ If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650505.html To unsubscribe from Using cumsum with 'group by' ?, click here. NAML
-- View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650538.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com