Using cumsum with 'group by' ?
HI, If that is the case, this should work: dat1<-read.table(text=" id,????????? x,????????? date 1,????????? 5,????????? 2012-06-05 12:01 1,????????? 10,??????? 2012-06-05 12:02 1,????????? 45,??????? 2012-06-05 12:03 2,????????? 5,????????? 2012-06-05 12:01 2,????????? 3,????????? 2012-06-05 12:03 2,????????? 2,????????? 2012-06-05 12:05 3,????????? 5,????????? 2012-06-05 12:03 3,????????? 5,????????? 2012-06-05 12:04 3,????????? 8,????????? 2012-06-05 12:05 1,????????? 5,????????? 2012-06-08 13:01 1,????????? 9,????????? 2012-06-08 13:02 1,????????? 3,????????? 2012-06-08 13:03 2,????????? 0,????????? 2012-06-08 13:15 2,????????? 1,????????? 2012-06-08 13:18 2,????????? 8,????????? 2012-06-08 13:20 2,????????? 4,????????? 2012-06-08 13:21 3,????????? 6,????????? 2012-06-08 13:15 3,????????? 2,????????? 2012-06-08 13:16 3,????????? 7,????????? 2012-06-08 13:17 3,????????? 2,????????? 2012-06-08 13:18 ",sep=",",header=TRUE,stringsAsFactors=FALSE) dat1$date<-as.Date(dat1$date,format="%Y-%m-%d %H:%M") ?dat2<-dat1[order(dat1[,1],dat1[,3]),] ?dat2$Cumsum<-ave(dat2$x,list(dat2$id,dat2$date),FUN=cumsum) head(dat2) #?? id? x?????? date Cumsum #1?? 1? 5 2012-06-05????? 5 #2?? 1 10 2012-06-05???? 15 #3?? 1 45 2012-06-05???? 60 #10? 1? 5 2012-06-08????? 5 #11? 1? 9 2012-06-08???? 14 #12? 1? 3 2012-06-08???? 17 #or with(dat2,aggregate(x,by=list(id=id,date=date),cumsum)) #? id?????? date??????????? x #1? 1 2012-06-05??? 5, 15, 60 #2? 2 2012-06-05???? 5, 8, 10 #3? 3 2012-06-05??? 5, 10, 18 #4? 1 2012-06-08??? 5, 14, 17 #5? 2 2012-06-08? 0, 1, 9, 13 #6? 3 2012-06-08 6, 8, 15, 17 A.K. ----- Original Message ----- From: TheRealJimShady <james.david.smith at gmail.com> To: r-help at r-project.org Cc: Sent: Friday, November 23, 2012 6:04 AM Subject: Re: [R] Using cumsum with 'group by' ? Hi Arun & everyone, Thank you very much for your helpful suggestions. I've been working through them, but have realised that my data is a little more complicated than I said and that the solutions you've kindly provided don't work. The problem is that there is more than one day of data for each person. It looks like this: id? ? ? ? ? x? ? ? ? ? date 1? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:01 1? ? ? ? ? 10? ? ? ? 2012-06-05 12:02 1? ? ? ? ? 45? ? ? ? 2012-06-05 12:03 2? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:01 2? ? ? ? ? 3? ? ? ? ? 2012-06-05 12:03 2? ? ? ? ? 2? ? ? ? ? 2012-06-05 12:05 3? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:03 3? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:04 3? ? ? ? ? 8? ? ? ? ? 2012-06-05 12:05 1? ? ? ? ? 5? ? ? ? ? 2012-06-08 13:01 1? ? ? ? ? 9? ? ? ? ? 2012-06-08 13:02 1? ? ? ? ? 3? ? ? ? ? 2012-06-08 13:03 2? ? ? ? ? 0? ? ? ? ? 2012-06-08 13:15 2? ? ? ? ? 1? ? ? ? ? 2012-06-08 13:18 2? ? ? ? ? 8? ? ? ? ? 2012-06-08 13:20 2? ? ? ? ? 4? ? ? ? ? 2012-06-08 13:21 3? ? ? ? ? 6? ? ? ? ? 2012-06-08 13:15 3? ? ? ? ? 2? ? ? ? ? 2012-06-08 13:16 3? ? ? ? ? 7? ? ? ? ? 2012-06-08 13:17 3? ? ? ? ? 2? ? ? ? ? 2012-06-08 13:18 So what I need to do is something like this (in pseudo code anyway): - Order the data by the id field and then the date field - add a new variable called cumsum - calculate this variable as the cumulative value of X, but grouping by the id and date (not date, not date and time). Thank you James On 23 November 2012 03:54, arun kirshna [via R]
<ml-node+s789695n4650505h81 at n4.nabble.com> wrote:
Hi, No problem. One more method if you wanted to try: library(data.table) dat2<-data.table(dat1) dat2[,list(x,time,Cumsum=cumsum(x)),list(id)] ? #? id? x? time Cumsum ? #1:? 1? 5 12:01? ? ? 5 ? #2:? 1 14 12:02? ? 19 ? #3:? 1? 6 12:03? ? 25 ? #4:? 1? 3 12:04? ? 28 ? #5:? 2 98 12:01? ? 98 ? #6:? 2 23 12:02? ? 121 ? #7:? 2? 1 12:03? ? 122 ? #8:? 2? 4 12:04? ? 126 ? #9:? 3? 5 12:01? ? ? 5 #10:? 3 65 12:02? ? 70 #11:? 3 23 12:03? ? 93 #12:? 3 23 12:04? ? 116 A.K. ----- Original Message ----- From: TheRealJimShady <[hidden email]> To: [hidden email] Cc: Sent: Thursday, November 22, 2012 12:27 PM Subject: Re: [R] Using cumsum with 'group by' ? Thank you very much, I will try these tomorrow morning. On 22 November 2012 17:25, arun kirshna [via R] <[hidden email]> wrote:
HI, You can do this in many ways: dat1<-read.table(text=" id? ? time? ? x 1? 12:01? ? 5 1? 12:02? 14 1? 12:03? 6 1? 12:04? 3 2? 12:01? 98 2? 12:02? 23 2? 12:03? 1 2? 12:04? 4 3? 12:01? 5 3? 12:02? 65 3? 12:03? 23 3? 12:04? 23 ",sep="",header=TRUE,stringsAsFactors=FALSE) ? dat1$Cumsum<-ave(dat1$x,dat1$id,FUN=cumsum) #or ? unlist(tapply(dat1$x,dat1$id,FUN=cumsum),use.names=FALSE) # [1]? 5? 19? 25? 28? 98 121 122 126? 5? 70? 93 116 #or library(plyr) ? ddply(dat1,.(id),function(x) cumsum(x[3]))[,2] # [1]? 5? 19? 25? 28? 98 121 122 126? 5? 70? 93 116 head(dat1) #? id? time? x Cumsum #1? 1 12:01? 5? ? ? 5 #2? 1 12:02 14? ? 19 #3? 1 12:03? 6? ? 25 #4? 1 12:04? 3? ? 28 #5? 2 12:01 98? ? 98 #6? 2 12:02 23? ? 121 A.K.
________________________________ If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650459.html To unsubscribe from Using cumsum with 'group by' ?, click here. NAML
-- View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650461.html Sent from the R help mailing list archive at Nabble.com. ? ? [[alternative HTML version deleted]]
______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ________________________________ If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650505.html To unsubscribe from Using cumsum with 'group by' ?, click here. NAML
-- View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650538.html Sent from the R help mailing list archive at Nabble.com. ??? [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.