Skip to content
Prev 311691 / 398502 Next

Using cumsum with 'group by' ?

HI,

If that is the case, this should work:
dat1<-read.table(text="
id,????????? x,????????? date
1,????????? 5,????????? 2012-06-05 12:01
1,????????? 10,??????? 2012-06-05 12:02
1,????????? 45,??????? 2012-06-05 12:03
2,????????? 5,????????? 2012-06-05 12:01
2,????????? 3,????????? 2012-06-05 12:03
2,????????? 2,????????? 2012-06-05 12:05
3,????????? 5,????????? 2012-06-05 12:03
3,????????? 5,????????? 2012-06-05 12:04
3,????????? 8,????????? 2012-06-05 12:05
1,????????? 5,????????? 2012-06-08 13:01
1,????????? 9,????????? 2012-06-08 13:02
1,????????? 3,????????? 2012-06-08 13:03
2,????????? 0,????????? 2012-06-08 13:15
2,????????? 1,????????? 2012-06-08 13:18
2,????????? 8,????????? 2012-06-08 13:20
2,????????? 4,????????? 2012-06-08 13:21
3,????????? 6,????????? 2012-06-08 13:15
3,????????? 2,????????? 2012-06-08 13:16
3,????????? 7,????????? 2012-06-08 13:17
3,????????? 2,????????? 2012-06-08 13:18
",sep=",",header=TRUE,stringsAsFactors=FALSE)
dat1$date<-as.Date(dat1$date,format="%Y-%m-%d %H:%M")
?dat2<-dat1[order(dat1[,1],dat1[,3]),]
?dat2$Cumsum<-ave(dat2$x,list(dat2$id,dat2$date),FUN=cumsum)

head(dat2)
#?? id? x?????? date Cumsum
#1?? 1? 5 2012-06-05????? 5
#2?? 1 10 2012-06-05???? 15
#3?? 1 45 2012-06-05???? 60
#10? 1? 5 2012-06-08????? 5
#11? 1? 9 2012-06-08???? 14
#12? 1? 3 2012-06-08???? 17
#or
with(dat2,aggregate(x,by=list(id=id,date=date),cumsum))
#? id?????? date??????????? x
#1? 1 2012-06-05??? 5, 15, 60
#2? 2 2012-06-05???? 5, 8, 10
#3? 3 2012-06-05??? 5, 10, 18
#4? 1 2012-06-08??? 5, 14, 17
#5? 2 2012-06-08? 0, 1, 9, 13
#6? 3 2012-06-08 6, 8, 15, 17
A.K.



----- Original Message -----
From: TheRealJimShady <james.david.smith at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Friday, November 23, 2012 6:04 AM
Subject: Re: [R] Using cumsum with 'group by' ?

Hi Arun & everyone,

Thank you very much for your helpful suggestions. I've been working
through them, but have realised that my data is a little more
complicated than I said and that the solutions you've kindly provided
don't work. The problem is that there is more than one day of data for
each person. It looks like this:

id? ? ? ? ? x? ? ? ? ? date
1? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:01
1? ? ? ? ? 10? ? ? ? 2012-06-05 12:02
1? ? ? ? ? 45? ? ? ? 2012-06-05 12:03
2? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:01
2? ? ? ? ? 3? ? ? ? ? 2012-06-05 12:03
2? ? ? ? ? 2? ? ? ? ? 2012-06-05 12:05
3? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:03
3? ? ? ? ? 5? ? ? ? ? 2012-06-05 12:04
3? ? ? ? ? 8? ? ? ? ? 2012-06-05 12:05
1? ? ? ? ? 5? ? ? ? ? 2012-06-08 13:01
1? ? ? ? ? 9? ? ? ? ? 2012-06-08 13:02
1? ? ? ? ? 3? ? ? ? ? 2012-06-08 13:03
2? ? ? ? ? 0? ? ? ? ? 2012-06-08 13:15
2? ? ? ? ? 1? ? ? ? ? 2012-06-08 13:18
2? ? ? ? ? 8? ? ? ? ? 2012-06-08 13:20
2? ? ? ? ? 4? ? ? ? ? 2012-06-08 13:21
3? ? ? ? ? 6? ? ? ? ? 2012-06-08 13:15
3? ? ? ? ? 2? ? ? ? ? 2012-06-08 13:16
3? ? ? ? ? 7? ? ? ? ? 2012-06-08 13:17
3? ? ? ? ? 2? ? ? ? ? 2012-06-08 13:18

So what I need to do is something like this (in pseudo code anyway):

- Order the data by the id field and then the date field
- add a new variable called cumsum
- calculate this variable as the cumulative value of X, but grouping
by the id and date (not date, not date and time).

Thank you

James





On 23 November 2012 03:54, arun kirshna [via R]
<ml-node+s789695n4650505h81 at n4.nabble.com> wrote:
--
View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650538.html
Sent from the R help mailing list archive at Nabble.com.
??? [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.