How to add unique occasions based on date within a subject in R?

Hi,
Maybe you can try:
###Use dput()

dat1 <- structure(list(trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 
11301L, 11301L, 11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 
11302L, 11302L, 11302L, 11302L, 11302L, 11302L, 11302L), event = c("pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2", "pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pm_intake", 
"am_intake", "pk1", "pk2", "pm_intake", "am_intake", "pk1"), 
    date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
    "2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
    "2011-02-03"), time = c("19:00", "07:00", "10:30", "19:00",
    "07:00", "09:54", "13:07", "19:00", "07:00", "11:30", "19:00",
    "07:00", "10:30", "19:00", "07:00", "09:54", "13:07", "19:00",
    "07:00", "11:30")), .Names = c("trialno", "event", "date",
??? "07:00", "11:30")), .Names = c("trialno", "event", "date", 
"time"), class = "data.frame", row.names = c("3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
"18", "19", "20", "21", "22"))

splitData <- split(dat1, dat1$trialno) # using your code
# Within each patient, flag the first row of each date and take the running total
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)

res$OCC
# [1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6
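
If you would rather skip split()/unsplit(), a single ave() call grouped by trialno should give the same numbering. This is only a sketch on a cut-down, hypothetical data frame (dat and its values are made up for illustration, not your full data), and it assumes the rows are already in date order within each patient:

```r
# Hypothetical cut-down data: two patients, five rows each
dat <- data.frame(
  trialno = rep(c(11301L, 11302L), each = 5),
  date = rep(c("2010-11-24", "2010-11-25", "2010-11-25",
               "2010-12-22", "2010-12-23"), 2),
  stringsAsFactors = FALSE
)

# ave() is given row indices (integers) so that the result stays integer;
# within each trialno, every date is numbered by its order of first appearance
dat$OCC <- ave(seq_along(dat$date), dat$trialno,
               FUN = function(i) match(dat$date[i], unique(dat$date[i])))

dat$OCC
# [1] 1 2 2 3 4 1 2 2 3 4
```

One difference to keep in mind: with match()/unique(), a date that shows up again later for the same patient gets its original number back, so this is only equivalent to the code above when the data are sorted by date within patient.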

A.K.
Hi All, 

I'm trying to figure out how in my data set to add a column including a
count of unique events based on date. Here is a part of my data set:

   trialno     event       date  time
3    11301 pm_intake 2010-11-24 19:00
4    11301 am_intake 2010-11-25 07:00
5    11301       pk1 2010-11-25 10:30
6    11301 pm_intake 2010-12-22 19:00
7    11301 am_intake 2010-12-23 07:00
8    11301       pk1 2010-12-23 09:54
9    11301       pk2 2010-12-23 13:07
10   11301 pm_intake 2011-02-02 19:00
11   11301 am_intake 2011-02-03 07:00
12   11301       pk1 2011-02-03 11:30

Basically, each date within each patient indicates a new occasion. If a
patient has just a drug administration on a given day, that is one occasion,
but if a patient had a drug administration and two measurements on the same
day, they all count as the same occasion. The data set does not have a
regular pattern (each patient has a different number of events on each date
and a different number of events in total).

What I'm trying to achieve is:

   trialno     event       date  time OCC
3    11301 pm_intake 2010-11-24 19:00   1
4    11301 am_intake 2010-11-25 07:00   2
5    11301       pk1 2010-11-25 10:30   2
6    11301 pm_intake 2010-12-22 19:00   3
7    11301 am_intake 2010-12-23 07:00   4
8    11301       pk1 2010-12-23 09:54   4
9    11301       pk2 2010-12-23 13:07   4
10   11301 pm_intake 2011-02-02 19:00   5
11   11301 am_intake 2011-02-03 07:00   6
12   11301       pk1 2011-02-03 11:30   6

I think I should apply some kind of a loop to identify within each patient
unique dates and count them...

I thought about splitting the whole data set into patients using split
function:

splitData<- split(data, data$trialno)

And applying lapply and transform to add a new column OCC (occasion) but I
don't know how to count those as integers...

I was thinking:

splitData <- lapply(splitData, function(df) {
  transform(df, OCC = ????????????????)
})

do.call("rbind", splitData)
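
For what it is worth, one way the placeholder in this skeleton might be filled in is with cumsum(!duplicated(date)). The following is a sketch only, on a small hypothetical data frame (dat and its values are invented for illustration), and it assumes the rows are sorted by date within each patient:

```r
# Hypothetical two-patient example; column names follow the data shown above
dat <- data.frame(
  trialno = c(11301L, 11301L, 11301L, 11302L, 11302L),
  date = c("2010-11-24", "2010-11-25", "2010-11-25",
           "2010-11-24", "2010-11-25"),
  stringsAsFactors = FALSE
)

splitData <- split(dat, dat$trialno)
splitData <- lapply(splitData, function(df) {
  # !duplicated(date) is TRUE on the first row of each date;
  # cumsum() turns those flags into a running occasion count
  transform(df, OCC = cumsum(!duplicated(date)))
})
res <- do.call("rbind", splitData)

res$OCC
# [1] 1 2 2 1 2
```

Note that do.call("rbind", ...) returns the rows grouped by trialno and prefixes the row names with the list names, which may or may not matter for later steps.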

I know how to do it in Excel:

=IF(D5=D4, E4,E4+1)

(If the value in the neighbouring cell is the same as in the cell above, then
the value in my cell is the same as the one above; otherwise it is one
greater.) This way the first cell in column E has to be 1, and the other
cells go up by one at each new date.
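
The Excel logic above translates fairly directly into vectorised R: compare each date with the one on the row above and take a running sum of the changes. A minimal sketch for a single patient, using a hypothetical date vector (the names date, is_new and OCC are just for illustration):

```r
# Hypothetical dates for one patient, already in row order
date <- c("2010-11-24", "2010-11-25", "2010-11-25",
          "2010-12-22", "2010-12-23", "2010-12-23")

# TRUE where a row's date differs from the row above;
# the first row always starts occasion 1
is_new <- c(TRUE, date[-1] != date[-length(date)])

# The running total of the "new date" flags is the occasion number
OCC <- cumsum(is_new)
OCC
# [1] 1 2 2 3 4 4
```

Like the Excel formula, this does not restart at 1 for a new patient, so it would have to be applied separately within each trialno.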

Help much appreciated!

Andrzej


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.