
Can R replicate this data manipulation in SAS?

I think this is kind of like asking "will your Land Rover make it up
my driveway?", but I'll assume the question was asked in all
seriousness.

Here is one solution:

## **** Read in test data;
dat <- read.table(textConnection("id    drug      start       stop
1004    NRTI     07/24/95    01/05/99
1004    NRTI     11/20/95    12/10/95
1004    NRTI     01/10/96    01/05/99
1004    PI       05/09/96    11/16/97
1004    NRTI     06/01/96    02/01/97
1004    NRTI     07/01/96    03/01/97
9999    PI       01/02/03    NA
9999    NNRTI    04/05/06    07/08/09"), header=TRUE)
closeAllConnections()

dat$start <- as.Date(dat$start, format = "%m/%d/%y")
dat$stop <- as.Date(dat$stop, format = "%m/%d/%y")
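
## **** Optional check, not part of the original solution: a small sketch
## of how the two-digit years are read. With "%y", R documents that 00-68
## become 20xx and 69-99 become 19xx, so 95 is 1995 and 03 is 2003. The
## name chk is only illustrative;
chk <- format(as.Date(c("07/24/95", "01/02/03"), format = "%m/%d/%y"))
chk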

## **** Reshape data into series with 1 date rather than separate starts and
## stops;

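library(reshape)
library(plyr)  # provides ddply(), used below; load it explicitly in case reshape does not attach it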

m.dat <- melt(dat, id = c("id", "drug"))
m.dat <- m.dat[order(m.dat$id, m.dat$value),]
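## Code each start as +1 and each stop as -1, then rename the columns so that
## "value" holds the +1/-1 indicator and "date" holds the event date;
m.dat$variable <- ifelse(m.dat$variable == "start", 1, -1)
names(m.dat) <- c("id", "drug", "value", "date")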
m.dat
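
## **** Aside, not part of the original solution: a minimal base-R sketch
## of why the +1/-1 coding is useful, on made-up toy data. Each start adds
## one drug in use and each stop removes one, so a running sum over the
## date-ordered events counts the drugs active after each event. The names
## ev and active are only illustrative;
ev <- data.frame(date  = as.Date(c("2000-01-01", "2000-02-01",
                                   "2000-03-01", "2000-04-01")),
                 value = c(1, 1, -1, -1))
ev$active <- cumsum(ev$value)
ev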

## **** Get regimen information plus start and stop dates;

n.dat <- cast(m.dat, id + date ~ drug, fun.aggregate=sum, margins="grand_col")
for (i in names(n.dat)[-c(1:2)]) {
  n.dat[i] <- cumsum(n.dat[i])
}
n.dat <- ddply(n.dat, .(id), transform,
      regimen = 1:length(id))
n.dat

ssd.dat <- ddply(n.dat, .(id), summarize,
                 id = id[-1],
                 regimen = regimen[-length(regimen)],
                 start_date = date[-length(date)],
                 stop_date = date[-1])
ssd.dat

## **** Merge data to create regimens dataset;
all.dat <- merge(n.dat[-2], ssd.dat)
all.dat <- all.dat[order(all.dat$id, all.dat$regimen),
                   c("id", "start_date", "stop_date", "regimen",
                     "NRTI", "NNRTI", "PI", "X.all.")]
all.dat


Best,
Ista
On Wed, Apr 20, 2011 at 2:59 PM, Ted Harding <ted.harding at wlandres.net> wrote: