Message-ID: <1353682797.55994.YahooMailNeo@web142606.mail.bf1.yahoo.com>
Date: 2012-11-23T14:59:57Z
From: arun
Subject: Using cumsum with 'group by' ?
In-Reply-To: <CAMu32AAnypr0ztcsVu+t8cijNf=Pnzc9pnygPjE1BLDBxEcRWw@mail.gmail.com>
Hi,
If that is the case, this should work:
dat1<-read.table(text="
id,x,date
1,5,2012-06-05 12:01
1,10,2012-06-05 12:02
1,45,2012-06-05 12:03
2,5,2012-06-05 12:01
2,3,2012-06-05 12:03
2,2,2012-06-05 12:05
3,5,2012-06-05 12:03
3,5,2012-06-05 12:04
3,8,2012-06-05 12:05
1,5,2012-06-08 13:01
1,9,2012-06-08 13:02
1,3,2012-06-08 13:03
2,0,2012-06-08 13:15
2,1,2012-06-08 13:18
2,8,2012-06-08 13:20
2,4,2012-06-08 13:21
3,6,2012-06-08 13:15
3,2,2012-06-08 13:16
3,7,2012-06-08 13:17
3,2,2012-06-08 13:18
",sep=",",header=TRUE,stringsAsFactors=FALSE)
dat1$date<-as.Date(dat1$date,format="%Y-%m-%d %H:%M")
dat2<-dat1[order(dat1[,1],dat1[,3]),]
dat2$Cumsum<-ave(dat2$x,list(dat2$id,dat2$date),FUN=cumsum)
head(dat2)
#    id  x       date Cumsum
#1    1  5 2012-06-05      5
#2    1 10 2012-06-05     15
#3    1 45 2012-06-05     60
#10   1  5 2012-06-08      5
#11   1  9 2012-06-08     14
#12   1  3 2012-06-08     17
#or
with(dat2,aggregate(x,by=list(id=id,date=date),cumsum))
#  id       date            x
#1  1 2012-06-05    5, 15, 60
#2  2 2012-06-05     5, 8, 10
#3  3 2012-06-05    5, 10, 18
#4  1 2012-06-08    5, 14, 17
#5  2 2012-06-08  0, 1, 9, 13
#6  3 2012-06-08 6, 8, 15, 17
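One caveat with the as.Date() step above: it drops the time of day, so the running total depends on the rows already being in time order within each day. Here is a minimal sketch that keeps the full timestamp for sorting and uses the calendar day only for grouping. The names dat3, stamp and day are made up for illustration, the data is a shortened and deliberately shuffled subset of the example, and tz="UTC" is an assumption because the thread does not state a time zone.

```r
# Sketch only: dat3, stamp and day are illustrative names, not from the thread.
# The first two rows are deliberately out of time order.
dat3 <- read.table(text = "
id,x,date
1,10,2012-06-05 12:02
1,5,2012-06-05 12:01
1,45,2012-06-05 12:03
2,5,2012-06-05 12:01
1,5,2012-06-08 13:01
1,9,2012-06-08 13:02
2,0,2012-06-08 13:15
2,1,2012-06-08 13:18
", sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Keep the full timestamp (UTC is an assumption; no time zone is given).
dat3$stamp <- as.POSIXct(dat3$date, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Calendar day, used only as a grouping key.
dat3$day <- as.Date(dat3$stamp)

# Sort by id and full timestamp so the running total follows time order.
dat3 <- dat3[order(dat3$id, dat3$stamp), ]

# Cumulative sum of x within each id/day combination.
dat3$Cumsum <- ave(dat3$x, dat3$id, dat3$day, FUN = cumsum)
dat3
```

If the data are already sorted by time within each id, this should give the same Cumsum column as the as.Date() version.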
A.K.
----- Original Message -----
From: TheRealJimShady <james.david.smith at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Friday, November 23, 2012 6:04 AM
Subject: Re: [R] Using cumsum with 'group by' ?
Hi Arun & everyone,
Thank you very much for your helpful suggestions. I've been working
through them, but have realised that my data is a little more
complicated than I said and that the solutions you've kindly provided
don't work. The problem is that there is more than one day of data for
each person. It looks like this:
id    x     date
1     5     2012-06-05 12:01
1     10    2012-06-05 12:02
1     45    2012-06-05 12:03
2     5     2012-06-05 12:01
2     3     2012-06-05 12:03
2     2     2012-06-05 12:05
3     5     2012-06-05 12:03
3     5     2012-06-05 12:04
3     8     2012-06-05 12:05
1     5     2012-06-08 13:01
1     9     2012-06-08 13:02
1     3     2012-06-08 13:03
2     0     2012-06-08 13:15
2     1     2012-06-08 13:18
2     8     2012-06-08 13:20
2     4     2012-06-08 13:21
3     6     2012-06-08 13:15
3     2     2012-06-08 13:16
3     7     2012-06-08 13:17
3     2     2012-06-08 13:18
So what I need to do is something like this (in pseudo code anyway):
- Order the data by the id field and then the date field
- add a new variable called cumsum
- calculate this variable as the cumulative value of x, but grouping
by the id and date (note: the date only, not the date and time).
Thank you
James
On 23 November 2012 03:54, arun kirshna [via R]
<ml-node+s789695n4650505h81 at n4.nabble.com> wrote:
> Hi,
> No problem.
> One more method if you wanted to try:
> library(data.table)
> dat2<-data.table(dat1)
> dat2[,list(x,time,Cumsum=cumsum(x)),list(id)]
> #     id  x  time Cumsum
> #  1:  1  5 12:01      5
> #  2:  1 14 12:02     19
> #  3:  1  6 12:03     25
> #  4:  1  3 12:04     28
> #  5:  2 98 12:01     98
> #  6:  2 23 12:02    121
> #  7:  2  1 12:03    122
> #  8:  2  4 12:04    126
> #  9:  3  5 12:01      5
> # 10:  3 65 12:02     70
> # 11:  3 23 12:03     93
> # 12:  3 23 12:04    116
>
>
> A.K.
>
>
>
> ----- Original Message -----
> From: TheRealJimShady <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Thursday, November 22, 2012 12:27 PM
> Subject: Re: [R] Using cumsum with 'group by' ?
>
> Thank you very much, I will try these tomorrow morning.
>
> On 22 November 2012 17:25, arun kirshna [via R]
> <[hidden email]> wrote:
>
>> Hi,
>> You can do this in many ways:
>> dat1<-read.table(text="
>> id   time    x
>> 1   12:01    5
>> 1   12:02   14
>> 1   12:03    6
>> 1   12:04    3
>> 2   12:01   98
>> 2   12:02   23
>> 2   12:03    1
>> 2   12:04    4
>> 3   12:01    5
>> 3   12:02   65
>> 3   12:03   23
>> 3   12:04   23
>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>> dat1$Cumsum<-ave(dat1$x,dat1$id,FUN=cumsum)
>> #or
>> unlist(tapply(dat1$x,dat1$id,FUN=cumsum),use.names=FALSE)
>> # [1]   5  19  25  28  98 121 122 126   5  70  93 116
>> #or
>> library(plyr)
>> ddply(dat1,.(id),function(x) cumsum(x[3]))[,2]
>> # [1]   5  19  25  28  98 121 122 126   5  70  93 116
>> head(dat1)
>> #  id  time  x Cumsum
>> #1  1 12:01  5      5
>> #2  1 12:02 14     19
>> #3  1 12:03  6     25
>> #4  1 12:04  3     28
>> #5  2 12:01 98     98
>> #6  2 12:02 23    121
>> A.K.
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650461.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
View this message in context: http://r.789695.n4.nabble.com/Using-cumsum-with-group-by-tp4650457p4650538.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.