Hello R User

5 messages · bibek sharma, arun, Jessica Streicher +1 more

#
Hello R User,
In the sample data given below, time is recorded sequentially for each
ID. For the analysis, I would like to set the first recorded time for
each ID to zero and replace each later time with its difference from the
previous time. For example, for ID==1, I would like to see
Time=0,3,1,3,6. This needs to be applied to a large data set.
Any suggestions are much appreciated!
Thanks,
Bibek

ID	Time
1	3
1	6
1	7
1	10
1	16
2	12
2	18
2	19
2	25
2	28
2	30
#
Hi,
Try this:
dat1<-read.table(text="
ID    Time
1    3
1    6
1    7
1    10
1    16
2    12
2    18
2    19
2    25
2    28
2    30
",sep="",header=TRUE)
dat1$Time1<-ave(dat1$Time,dat1$ID,FUN=function(x) c(0,diff(x)))
head(dat1,3)
#  ID Time Time1
#1  1    3     0
#2  1    6     3
#3  1    7     1

#or
dat2<-unsplit(lapply(split(dat1,dat1$ID),function(x) {x$Time<-c(0,diff(x[,2])); return(x)}),dat1$ID)
head(dat2,3)
#  ID Time
#1  1    0
#2  1    3
#3  1    1
A.K.
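
For readers new to the idiom, here is a minimal sketch of what c(0,diff(x)) does and how ave() applies it within each ID. The vectors time_id1, id and time are illustrative names, filled in from the sample data in the question; this is only a worked example, not a different method.

```r
# Times for ID 1 from the sample data (illustrative variable name)
time_id1 <- c(3, 6, 7, 10, 16)

# diff() returns successive differences, one element shorter than its input
d <- diff(time_id1)            # 3 1 3 6

# Prepending 0 restores the original length and zeroes the first record
step_id1 <- c(0, d)            # 0 3 1 3 6

# ave() splits the first argument by the grouping vector, applies FUN to
# each piece, and puts the results back in the original row order
id   <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
time <- c(3, 6, 7, 10, 16, 12, 18, 19, 25, 28, 30)
delta <- ave(time, id, FUN = function(x) c(0, diff(x)))
print(delta)                   # 0 3 1 3 6 0 6 1 6 3 2
```

The result for ID 1 matches the Time=0,3,1,3,6 asked for in the question.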




----- Original Message -----
From: bibek sharma <mbhpathak at gmail.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Friday, December 14, 2012 10:51 AM
Subject: [R] Hello R User

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
dataset<-data.frame(id=c(1,1,2,3,3,3),time=c(3,5,1,2,4,6))
dataset
#  id time
#1  1    3
#2  1    5
#3  2    1
#4  3    2
#5  3    4
#6  3    6
ids<-unique(dataset$id)
for(id in ids){
  dataset$time[dataset$id==id]<-c(0,diff(dataset$time[dataset$id==id]))
}
dataset
#  id time
#1  1    0
#2  1    2
#3  2    0
#4  3    0
#5  3    2
#6  3    2

This might not be the fastest approach, though.
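
One likely reason the loop is slow on large data is that every iteration scans the whole id column to build dataset$id==id. As a rough sketch, and only under the assumption that all rows of an id sit next to each other (as in the sample data), the same result can be had without a loop by differencing the whole column once and then zeroing the first row of each id. The names first_of_id and delta are made up for this illustration.

```r
dataset <- data.frame(id = c(1, 1, 2, 3, 3, 3), time = c(3, 5, 1, 2, 4, 6))

# ASSUMPTION: rows belonging to the same id are contiguous.
# A row starts a new id if it is the first row or its id differs
# from the id of the row before it.
first_of_id <- c(TRUE, dataset$id[-1] != dataset$id[-nrow(dataset)])

# Difference the whole column in one pass ...
delta <- c(0, diff(dataset$time))

# ... then reset the first record of each id to zero
delta[first_of_id] <- 0

dataset$delta <- delta
print(dataset$delta)   # 0 2 0 0 2 2
```

If the rows are not grouped by id, sort them first or use one of the grouped approaches in this thread instead.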
On 14.12.2012, at 16:51, bibek sharma wrote:

            
#
Hi Bibek,
how about this?

dta<-read.table(textConnection("ID	Time
1	3
1	6
1	7
1	10
1	16
2	12
2	18
2	19
2	25
2	28
2	30"),header=T)

dta$delta<-with(dta,ave(Time,ID,FUN=function(x)c(0,diff(x))))
dta

hth.

On 14.12.2012 at 16:51, bibek sharma wrote:
#
Hi,

You could also use library(data.table) to do this faster.
dat1<-read.table(text="
ID    Time
1    3
1    6
1    7
1    10
1    16
2    12
2    18
2    19
2    25
2    28
2    30
",sep="",header=TRUE)
library(data.table)
dat2<-data.table(dat1)
res<-dat2[,Time1:=c(0,diff(Time)),by=ID]
head(res,3)
#   ID Time Time1
#1:  1    3     0
#2:  1    6     3
#3:  1    7     1

#Comparing different approaches:
set.seed(55)
dat3<- data.frame(ID=rep(1:1000,each=500),Value=sample(1:800,5e5,replace=TRUE))
dat4<-data.table(dat3)
system.time(dat3$Value1<-ave(dat3$Value,dat3$ID,FUN=function(x) c(0,diff(x))))
#   user  system elapsed 
#  0.312   0.000   0.313 

ids<-unique(dat3$ID)
system.time({
  for(id in ids){
    dat3$Value[dat3$ID==id]<-c(0,diff(dat3$Value[dat3$ID==id]))
  } })
#   user  system elapsed 
# 36.938   0.868  37.873 

system.time(dat5<-dat4[,Value1:=c(0,diff(Value)),by=ID])
#   user  system elapsed 
#  0.036   0.000   0.037 
head(dat5)
#   ID Value Value1
#1:  1   439      0
#2:  1   175   -264
#3:  1    28   -147
#4:  1   634    606
#5:  1   449   -185
#6:  1    60   -389
#the for loop overwrote dat3$Value in place, so Value now matches Value1
head(dat3)
#  ID Value Value1
#1  1     0      0
#2  1  -264   -264
#3  1  -147   -147
#4  1   606    606
#5  1  -185   -185
#6  1  -389   -389

A.K.
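
When comparing approaches it can help to confirm that they agree before looking at timings. Below is a small base-R sketch of such a check on a scaled-down version of the simulated data. The names dat, by_ave, by_loop and idx are hypothetical, and the loop writes into a separate vector so that the original Value column is left untouched.

```r
set.seed(55)
# Smaller simulated data set (100 IDs x 50 rows) so the loop finishes quickly
dat <- data.frame(ID = rep(1:100, each = 50),
                  Value = sample(1:800, 5000, replace = TRUE))

# Grouped differences with ave()
by_ave <- ave(dat$Value, dat$ID, FUN = function(x) c(0, diff(x)))

# Same computation with an explicit loop, writing to a copy of the column
by_loop <- dat$Value
for (id in unique(dat$ID)) {
  idx <- dat$ID == id
  by_loop[idx] <- c(0, diff(dat$Value[idx]))
}

# Both approaches should give the same per-ID differences,
# and the first record of every ID should be zero
stopifnot(isTRUE(all.equal(by_ave, by_loop)))
stopifnot(all(by_ave[!duplicated(dat$ID)] == 0))
```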