Creating missingness in repeated measurement data
On Sep 17, 2012, at 11:32 AM, john james wrote:
Dear R users,

I have the following problem. My dataset (dat) is as follows:

a <- c(1,2,3)
id <- rep(a, c(3,2,3))
stat <- c(1,1,0,1,0,1,1,1)
g <- c(0,0,0,0,0,0,1,0)
stop <- c(1,2,4,2,4,1,1.5,3)
dat <- data.frame(id,stat,g,stop)

I want to create a new dataset (dat2) with missing values such that when either g == 1 or stat == 0, the remaining rows for that subject (from that row onward) are set to NA, together with a new variable d that gives the exact time, taken from the stop variable, at which this happened. By this I mean a dat2 that looks like:

id <- rep(a, c(3,2,3))
sta2 <- c(1,1,NA,1,NA,1,NA,NA)
g2 <- c(0,0,NA,0,NA,0,NA,NA)
stop2 <- c(1,2,NA,2,NA,1,NA,NA)
d <- c(4,4,NA,4,NA,1.5,NA,NA)
dat2 <- data.frame(id=id, stat2=sta2, g2=g2, stop2=stop2, d=d)
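# Running count, within each id, of rows where stat == 0 or g == 1 (nonzero from the first such row on)
suppressidx <- ave(dat$stat == 0 | dat$g == 1, dat$id, FUN = cumsum)
# Replace a column's values with NA wherever the running count is nonzero
suppress <- function(col) { ifelse(suppressidx > 0, NA, col) }
cbind(dat[1], sapply(dat[-1], suppress))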
  id stat  g stop
1  1    1  0    1
2  1    1  0    2
3  1   NA NA   NA
4  2    1  0    2
5  2   NA NA   NA
6  3    1  0    1
7  3   NA NA   NA
8  3   NA NA   NA
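
The result above leaves out the d column that the question asked for. What follows is only a sketch of one way to add it, not part of the original reply. It assumes d should hold the stop time of the first row per id where stat == 0 or g == 1, which matches the d values listed in the question. The names event, suppressed and first_event_time are illustrative.

```r
# Example data from the question
dat <- data.frame(
  id   = rep(c(1, 2, 3), c(3, 2, 3)),
  stat = c(1, 1, 0, 1, 0, 1, 1, 1),
  g    = c(0, 0, 0, 0, 0, 0, 1, 0),
  stop = c(1, 2, 4, 2, 4, 1, 1.5, 3)
)

# TRUE on rows where the triggering event occurs
event <- dat$stat == 0 | dat$g == 1

# TRUE from the first event row onward within each id
suppressed <- ave(as.integer(event), dat$id, FUN = cumsum) > 0

# Stop time of the first event within each id, repeated on every row
# of that id (NA if the id never has an event) -- assumed meaning of d
first_event_time <- ave(ifelse(event, dat$stop, NA), dat$id,
                        FUN = function(x) x[!is.na(x)][1])

# Blank out the suppressed rows and attach d to the rows that remain
dat2 <- data.frame(
  id    = dat$id,
  stat2 = ifelse(suppressed, NA, dat$stat),
  g2    = ifelse(suppressed, NA, dat$g),
  stop2 = ifelse(suppressed, NA, dat$stop),
  d     = ifelse(suppressed, NA, first_event_time)
)
print(dat2)
```

A subject with no event at all would keep all of its rows and get d = NA under this sketch; whether that is the desired behaviour is not stated in the question.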
David Winsemius, MD
Alameda, CA, USA