R_closest date
3 messages · Weijia Wang, Rui Barradas, arun
Hello,

Try the following.

dat <- read.table(text="
PT_ID IDX_DT OBS_DATE DAYS_DIFF OBS_VALUE CATEGORY
13 4549 2002-08-21 2002-08-20 -1 183 2
14 4549 2002-08-21 2002-11-14 85 91 1
15 4549 2002-08-21 2003-02-18 181 89 1
16 4549 2002-08-21 2003-05-15 267 109 2
17 4549 2002-08-21 2003-12-16 482 96 1
128 4839 2006-11-28 2006-11-28 0 179 2
", header=TRUE)

spl <- split(dat, dat$PT_ID)
# abs() so that the closest date wins, whether it falls before or after IDX_DT
idx <- sapply(spl, function(x) which.min(abs(x$DAYS_DIFF)))
res <- lapply(names(idx), function(nm) spl[[ nm ]][ idx[nm], ])
do.call(rbind, res)

Then assign the return value of do.call to your result (you can reuse 'res').

Hope this helps,

Rui Barradas

On 01-09-2012 18:10, WANG WEIJIA wrote:
Hi,
I have run into a problem finding the date closest to another date.
This is what the data frame looks like:
PT_ID IDX_DT OBS_DATE DAYS_DIFF OBS_VALUE CATEGORY
13 4549 2002-08-21 2002-08-20 -1 183 2
14 4549 2002-08-21 2002-11-14 85 91 1
15 4549 2002-08-21 2003-02-18 181 89 1
16 4549 2002-08-21 2003-05-15 267 109 2
17 4549 2002-08-21 2003-12-16 482 96 1
128 4839 2006-11-28 2006-11-28 0 179 2
I need to find the single observation whose 'OBS_DATE' is closest to 'IDX_DT'.
For example, for 'PT_ID' 4549, I need row 13, whose OBS_DATE is just one day away from IDX_DT.
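In the table above, DAYS_DIFF appears to be OBS_DATE minus IDX_DT in days (2002-08-20 minus 2002-08-21 gives -1). A minimal sketch of recomputing it, assuming both columns are stored as ISO-formatted text; the data frame name `d` and the extra column `ABS_DIFF` are hypothetical and only used for illustration:

```r
# Three rows taken from the table above (PT_ID, IDX_DT, OBS_DATE only)
d <- data.frame(
  PT_ID    = c(4549, 4549, 4839),
  IDX_DT   = c("2002-08-21", "2002-08-21", "2006-11-28"),
  OBS_DATE = c("2002-08-20", "2002-11-14", "2006-11-28"),
  stringsAsFactors = FALSE
)

# Subtracting two Date vectors gives a difftime in days; as.integer() drops the units
d$DAYS_DIFF <- as.integer(as.Date(d$OBS_DATE) - as.Date(d$IDX_DT))

# Distance from the index date regardless of direction (before or after)
d$ABS_DIFF <- abs(d$DAYS_DIFF)
d
```

"Closest" then means the smallest ABS_DIFF within each PT_ID, not the smallest DAYS_DIFF.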
I was thinking about using abs(), and I got this:
baseline <- function(x){

  #remove all unnecessary variables
  baseline <- x[,c("PT_ID","DAYS_DIFF")]

  #get a list of every unique ID
  uniqueID <- unique(baseline$PT_ID)

  #make a vector that will contain the smallest DAYS_DIFF
  first <- rep(-99,length(uniqueID))

  i = 1
  #loop through each unique ID
  for (PT_ID in uniqueID){

    #for each iteration get the smallest DAYS_DIFF for that ID
    first[i] <- min(baseline[which(baseline$PT_ID==PT_ID),abs(baseline$DAYS_DIFF)])

    #up the iteration counter
    i = i + 1

  }
  #make a data frame with the lowest DAYS_DIFF and ID
  newdata <- data.frame(uniqueID,first)
  names(newdata) <- c("PT_ID","DAYS_DIFF")

  #return the data frame containing the lowest DAYS_DIFF for each ID
  return(newdata)
}
ldl.b<-baseline(ldl) #get all baseline ldl patient ID, total 11368 obs, all unique#
Error in `[.data.frame`(baseline, which(baseline$PT_ID == PT_ID), abs(baseline$DAYS_DIFF)) :
  undefined columns selected
Can anyone help me figure out how to get the minimum of the absolute value of DAYS_DIFF for each unique ID?
Thanks a lot
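The error seems to come from the second argument to `[`: in `baseline[rows, cols]` it selects columns, so `abs(baseline$DAYS_DIFF)` is read as a set of column numbers (1, 85, 181, ...) that the two-column data frame does not have. A minimal sketch of one possible fix, which subsets the DAYS_DIFF vector by ID before applying abs(). The function name `baseline_fixed` and the small `ldl` stand-in are hypothetical, since the real `ldl` data are not shown in the thread:

```r
# Stand-in for the real 'ldl' data frame (values copied from the table above)
ldl <- data.frame(
  PT_ID     = c(4549, 4549, 4549, 4549, 4549, 4839),
  DAYS_DIFF = c(-1, 85, 181, 267, 482, 0)
)

baseline_fixed <- function(x) {
  ids <- unique(x$PT_ID)
  # for each ID, subset the DAYS_DIFF vector first, then take abs() and min()
  first <- vapply(ids,
                  function(id) min(abs(x$DAYS_DIFF[x$PT_ID == id])),
                  numeric(1))
  data.frame(PT_ID = ids, DAYS_DIFF = first)
}

ldl.b <- baseline_fixed(ldl)
ldl.b
```

Note that this returns the smallest absolute difference per ID, not the matching row; picking the row itself needs which.min() or a merge.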
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi,
Try this:
dat1 <- read.table(text="
    PT_ID     IDX_DT   OBS_DATE DAYS_DIFF OBS_VALUE CATEGORY
13   4549 2002-08-21 2002-08-20        -1       183        2
14   4549 2002-08-21 2002-11-14        85        91        1
15   4549 2002-08-21 2003-02-18       181        89        1
16   4549 2002-08-21 2003-05-15       267       109        2
17   4549 2002-08-21 2003-12-16       482        96        1
128  4839 2006-11-28 2006-11-28         0       179        2
", header=TRUE)
# per PT_ID, keep the signed DAYS_DIFF with the smallest absolute value
dat3 <- aggregate(DAYS_DIFF~PT_ID, data=dat1, function(d) d[which.min(abs(d))])
merge(dat1,dat3)
#  PT_ID DAYS_DIFF     IDX_DT   OBS_DATE OBS_VALUE CATEGORY
#1  4549        -1 2002-08-21 2002-08-20       183        2
#2  4839         0 2006-11-28 2006-11-28       179        2
#or,
dat2 <- tapply(dat1$DAYS_DIFF, dat1$PT_ID, function(d) d[which.min(abs(d))])
dat4 <- data.frame(PT_ID=row.names(data.frame(dat2)), DAYS_DIFF=dat2)
row.names(dat4) <- 1:nrow(dat4)
merge(dat1,dat4)
#  PT_ID DAYS_DIFF     IDX_DT   OBS_DATE OBS_VALUE CATEGORY
#1  4549        -1 2002-08-21 2002-08-20       183        2
#2  4839         0 2006-11-28 2006-11-28       179        2
A.K.
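Another way to keep the whole closest row per patient, without a merge step, is to sort by absolute difference and keep the first row of each PT_ID. This is only a sketch of that alternative, not something proposed in the thread; the object names `ord` and `closest` are hypothetical:

```r
dat1 <- read.table(text = "
    PT_ID     IDX_DT   OBS_DATE DAYS_DIFF OBS_VALUE CATEGORY
13   4549 2002-08-21 2002-08-20        -1       183        2
14   4549 2002-08-21 2002-11-14        85        91        1
15   4549 2002-08-21 2003-02-18       181        89        1
16   4549 2002-08-21 2003-05-15       267       109        2
17   4549 2002-08-21 2003-12-16       482        96        1
128  4839 2006-11-28 2006-11-28         0       179        2
", header = TRUE)

# Sort within each patient by distance from the index date (either direction)
ord <- dat1[order(dat1$PT_ID, abs(dat1$DAYS_DIFF)), ]

# duplicated() flags every row after the first for a given PT_ID,
# so negating it keeps exactly one row per patient: the closest one
closest <- ord[!duplicated(ord$PT_ID), ]
closest
```

If two observations are equally close (for example -3 and 3), this keeps whichever comes first in the original row order, so a tie-breaking rule would have to be added to the order() call.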
----- Original Message -----
From: WANG WEIJIA <wwang.nyu at gmail.com>
To: "r-help at R-project.org" <r-help at r-project.org>
Cc:
Sent: Saturday, September 1, 2012 1:10 PM
Subject: [R] R_closest date