
How to add unique occasions based on date within a subject in R?

3 messages · Andrzej Bienczak, arun

#
Hi,
Maybe you can try:
###Use dput()

dat1 <- structure(list(trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 
11301L, 11301L, 11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 
11302L, 11302L, 11302L, 11302L, 11302L, 11302L, 11302L), event = c("pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2", "pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pm_intake", 
"am_intake", "pk1", "pk2", "pm_intake", "am_intake", "pk1"), 
    date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03"), time = c("19:00", "07:00", "10:30", "19:00", 
    "07:00", "09:54", "13:07", "19:00", "07:00", "11:30", "19:00", 
    "07:00", "10:30", "19:00", "07:00", "09:54", "13:07", "19:00", 
    "07:00", "11:30")), .Names = c("trialno", "event", "date", 
"time"), class = "data.frame", row.names = c("3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
"18", "19", "20", "21", "22"))


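splitData <- split(dat1, dat1$trialno) # using your code
# Within each patient, flag the first row of every date and take the running
# total of those flags; unsplit() puts the rows back in their original order.
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)

res$OCC
#[1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6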
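
To see why the one-liner above works, it may help to unpack it for a single patient. This is only an illustrative sketch: the vector `d` and the intermediate names `pos_in_date` and `is_first` are made up here and are not part of the code above.

```r
# Dates for one patient (the first seven rows of dat1)
d <- c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
       "2010-12-23", "2010-12-23", "2010-12-23")

# Position of each row within its own date group (1 = first row of that date)
pos_in_date <- ave(seq_along(d), d, FUN = seq_along)

# TRUE marks the first row of each date
is_first <- pos_in_date == 1

# The running count of "first rows" is the occasion number
occ <- cumsum(is_first)
occ
#[1] 1 2 2 3 4 4 4
```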


A.K.
On Thursday, November 21, 2013 2:04 PM, Andrzej Bienczak <andrzej.bienczak at googlemail.com> wrote:
Hi All, 



I'm trying to figure out how in my data set to add a column including a
count of unique events based on date. Here is a part of my data set:



   trialno     event       date  time
3    11301 pm_intake 2010-11-24 19:00
4    11301 am_intake 2010-11-25 07:00
5    11301       pk1 2010-11-25 10:30
6    11301 pm_intake 2010-12-22 19:00
7    11301 am_intake 2010-12-23 07:00
8    11301       pk1 2010-12-23 09:54
9    11301       pk2 2010-12-23 13:07
10   11301 pm_intake 2011-02-02 19:00
11   11301 am_intake 2011-02-03 07:00
12   11301       pk1 2011-02-03 11:30







Basically, each new date within a patient indicates a new occasion. If a
patient has only a drug administration on a given day, that is one occasion;
if a patient had a drug administration and two measurements on the same day,
they all count as the same occasion. The data set does not have a regular
pattern (each patient has a different number of events on each date and a
different total number of events).

What I'm trying to achieve is:



   trialno     event       date  time OCC
3    11301 pm_intake 2010-11-24 19:00   1
4    11301 am_intake 2010-11-25 07:00   2
5    11301       pk1 2010-11-25 10:30   2
6    11301 pm_intake 2010-12-22 19:00   3
7    11301 am_intake 2010-12-23 07:00   4
8    11301       pk1 2010-12-23 09:54   4
9    11301       pk2 2010-12-23 13:07   4
10   11301 pm_intake 2011-02-02 19:00   5
11   11301 am_intake 2011-02-03 07:00   6
12   11301       pk1 2011-02-03 11:30   6



I think I should apply some kind of a loop to identify within each patient
unique dates and count them...

I thought about splitting the whole data set into patients using split
function:

splitData<- split(data, data$trialno)



And applying lapply and transform to add a new column OCC (occasion) but I
don't know how to count those as integers...

I was thinking:



splitData <- lapply(splitData, function(df) {
  transform(df, OCC = ????)
})

do.call("rbind", splitData)
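
One possible way to fill in the placeholder in the split/lapply/transform outline above is `match(date, unique(date))`, which numbers each row by the position of its date among that patient's distinct dates. This is a sketch on a small made-up subset (the data frame `dat` below is hypothetical), and it assumes each patient's rows are already sorted by date.

```r
# Small stand-in for the real data: two patients, a few rows each
dat <- data.frame(
  trialno = c(11301L, 11301L, 11301L, 11302L, 11302L),
  date = c("2010-11-24", "2010-11-25", "2010-11-25",
           "2010-11-24", "2010-11-24"),
  stringsAsFactors = FALSE
)

splitData <- split(dat, dat$trialno)
splitData <- lapply(splitData, function(df) {
  # Index of each row's date among the patient's distinct dates,
  # in order of first appearance
  transform(df, OCC = match(date, unique(date)))
})
out <- do.call("rbind", splitData)
out$OCC
#[1] 1 2 2 1 1
```

Note that `do.call("rbind", ...)` returns the rows grouped by patient and with new row names, so the result matches the original row order only if the data were already grouped by `trialno`.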



I know how to do it in Excel:

=IF(D5=D4, E4,E4+1)

(if the date in column D is the same as in the row above, then the value in
column E is the same as in the row above; otherwise it is one greater). This
way the first cell in column E has to be 1, and the value increases by one
at each new date.
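
The Excel rule can be written almost literally in R by comparing each date with the one in the row above. The helper name `occasion_from_changes` is made up for this sketch; it assumes a non-empty vector of dates for a single patient, so it would still need to be applied patient by patient.

```r
# Hypothetical helper mirroring =IF(D5=D4, E4, E4+1) for one patient's dates
occasion_from_changes <- function(d) {
  # The first row starts at 1; each later row adds 1 only when its date
  # differs from the date in the row above
  cumsum(c(TRUE, d[-1] != d[-length(d)]))
}

d <- c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", "2010-12-23")
occasion_from_changes(d)
#[1] 1 2 2 3 4
```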

Help much appreciated!

Andrzej




    [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hi,
Maybe you can try:
###Use dput()

dat1 <- structure(list(trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 
11301L, 11301L, 11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 
11302L, 11302L, 11302L, 11302L, 11302L, 11302L, 11302L), event = c("pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2", "pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pm_intake", 
"am_intake", "pk1", "pk2", "pm_intake", "am_intake", "pk1"), 
    date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03"), time = c("19:00", "07:00", "10:30", "19:00", 
    "07:00", "09:54", "13:07", "19:00", "07:00", "11:30", "19:00", 
    "07:00", "10:30", "19:00", "07:00", "09:54", "13:07", "19:00", 
    "07:00", "11:30")), .Names = c("trialno", "event", "date", 
"time"), class = "data.frame", row.names = c("3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
"18", "19", "20", "21", "22"))


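splitData <- split(dat1, dat1$trialno) # using your code
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)

res$OCC
#[1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6

#or

# ave() by trialno does the per-patient grouping in one step; because its first
# argument (date) is character, the counts are converted back with as.numeric()
within(dat1, OCC <- as.numeric(ave(date, trialno, FUN = function(x)
  cumsum(ave(seq_along(x), x, FUN = seq_along) == 1))))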
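
A variation on the second form, offered only as a sketch: `ave()` writes the results back into a copy of its first argument, so handing it integer row indices instead of the character `date` column avoids the `as.numeric()` step. The data frame `dat` below is a made-up subset, and `!duplicated(d)` is used as a shorter way of flagging the first row of each date.

```r
# Made-up subset with two patients
dat <- data.frame(
  trialno = c(11301L, 11301L, 11301L, 11302L, 11302L, 11302L),
  date = c("2010-11-24", "2010-11-25", "2010-11-25",
           "2010-11-24", "2010-11-24", "2010-12-22"),
  stringsAsFactors = FALSE
)

# Group the row indices by patient, look up that patient's dates, and count
# first occurrences; the result stays integer because the input is integer
dat$OCC <- ave(seq_len(nrow(dat)), dat$trialno, FUN = function(i) {
  d <- dat$date[i]
  cumsum(!duplicated(d))
})
dat$OCC
#[1] 1 2 2 1 1 2
```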


A.K.