
plotting and coloring longitudinal data with three time points (ggplot2)

4 messages · Eric Fail, Jim Lemon, Hadley Wickham

#
Dear list,

I have been struggling with this for some time now, and I spent the last hour trying to put together a working example for the list. I hope someone out there has some experience with plotting longitudinal data that they are willing to share.

My data is patient data with three different time stamps. First, the patients are identified at different times (time stamp 1). Second, they go through an assessment phase and begin their treatment (time stamp 2). Finally, they are admitted from the hospital at some point (time stamp 3).

I would like to make a spaghetti plot with the assessment phase in one color and the treatment phase in another color.

I used ggplot2, and with this example data and only two time points it works fine (I call it my working example):

library(ggplot2)
df <- data.frame(
  date = seq(Sys.Date(), len=104, by="1 day")[sample(104, 52)],
  patient = factor(rep(1:26, 2), labels = LETTERS)
)
df <- df[order(df$date), ] 
dt <- qplot(date, patient, data=df, geom="line") 
dt + scale_x_date()
df[ which(df$patient=='E'), c("patient", "date")]

But if I have three time points, R, for some reason I do not yet understand, adds the two second time points in some funny way.

Finally, once that is solved, how do I color the different parts of the line so that the assessment phase gets one color and the treatment phase another?

I want to be able to show how long we have been in contact with our patients, how much of the contact time was assessment, and how much was actual treatment.

Below is an example (I call it the not-working example):

df2 <- data.frame(
  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

df2 <- df2[order(df2$date2), ] 
dt2 <- qplot(date2, patient2, data=df2, geom="line") 
dt2 + scale_x_date(major="months", minor="weeks") 
df2[ which(df2$patient2=='B'), c("patient2", "date2")]
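
For anyone running this on a recent ggplot2: as far as I know, the `major` and `minor` arguments to `scale_x_date()` were replaced by `date_breaks` and `date_minor_breaks` in later releases. A minimal sketch of the same plot under that assumption (it uses `ggplot()` with an explicit `group` instead of `qplot()`, which is my choice, not part of the original post):

```r
library(ggplot2)

# Same simulated data as above: 26 patients, three random dates each
df2 <- data.frame(
  date2 = seq(Sys.Date(), length.out = 156, by = "2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# One horizontal line per patient, monthly major and weekly minor breaks
p <- ggplot(df2, aes(x = date2, y = patient2, group = patient2)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_minor_breaks = "1 week")
print(p)
```

Each patient's line runs from the earliest to the latest date, so the middle time stamp is hidden unless points are added on top.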

If someone can point me in a direction, tell me what I am doing wrong, or suggest some amazing package for plotting longitudinal data, I would be very grateful.

Thanks,
Eric
#
On 12/07/2011 08:02 PM, Eric Fail wrote:
Hi Eric,
Try this; I think it does more or less what you want. I tried to work this out with matplot, but couldn't.

library(plotrix)
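# base_dates, dates2 and dates3 are not defined in this message; they are
# assumed to be three Date vectors of length 26 (one date per patient)
df2<-data.frame(dates=c(base_dates,dates2,dates3),patients=rep(LETTERS,3),
  occasion=rep(c("Assessment","Treatment","Hospital"),each=26))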
plot(df2$dates,as.numeric(factor(df2$patients)),
  main="Dates of treatment stages by patient",
  xlab="Date",ylab="Patient",axes=FALSE,pch=rep(c("A","T","H"),each=26))
axis.dates<-c("2011-01-01","2011-03-01","2011-05-01","2011-07-01",
  "2011-09-01","2011-11-01")
axis(1,at=as.Date(axis.dates,"%Y-%m-%d"),labels=axis.dates)
staxlab(2,at=1:26,labels=LETTERS)
box()
for(i in 1:26) {
  lines(df2$dates[c(i,i+26)],c(i,i),col=2)
  lines(df2$dates[c(i+26,i+52)],c(i,i),col=3)
}
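
The plotting code above needs `base_dates`, `dates2` and `dates3`, which do not appear in the message. A minimal sketch of how they might be simulated, assuming one date per patient for each stage and dates that fall inside the 2011 axis range used above (the seed, start date and day ranges are all made up for illustration):

```r
set.seed(42)  # assumption: any seed will do, fixed only for reproducibility

# Hypothetical first time stamp: each patient identified in early 2011
base_dates <- as.Date("2011-01-01") + sample(0:120, 26, replace = TRUE)

# Hypothetical second time stamp: treatment starts 7 to 60 days later
dates2 <- base_dates + sample(7:60, 26, replace = TRUE)

# Hypothetical third time stamp: 14 to 150 days after treatment starts
dates3 <- dates2 + sample(14:150, 26, replace = TRUE)
```

Define these before running the plotting code, so that each patient gets a red segment from the first to the second date and a green one from the second to the third.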

Jim
#
On Wed, Dec 7, 2011 at 4:02 AM, Eric Fail <eric.fail at gmx.us> wrote:
Did you mean something like this?

library(ggplot2)
library(plyr)

df2 <- data.frame(
  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# visit number = chronological rank of each date within the patient
df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))

qplot(date2, patient2, data = df2, geom = "line") +
  geom_point(aes(colour = factor(visit)))

# or this?

library(ggplot2)
library(plyr)

df2 <- data.frame(
  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# visit number = chronological rank of each date within the patient
df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))

qplot(date2, patient2, data = df2, geom = "line",
  colour = factor(visit), group = patient2)

# Obviously the lines are drawn between the observations, and each segment
# takes the colour of the observation it starts from, so you only see the
# colours of the first two visits.
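
If the goal is one colour per phase rather than per observation, another option is to build the two phases explicitly and draw them with `geom_segment()`. This is only a sketch under my own assumptions: the `segments` data frame, its column names and the phase labels are made up for illustration, and the seed is there only for reproducibility.

```r
library(ggplot2)

set.seed(2011)  # assumption: fixed seed, only so the sketch is reproducible
df2 <- data.frame(
  date2 = seq(Sys.Date(), length.out = 156, by = "2 day")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# Sort by patient, then by date, so each patient's rows run visit 1, 2, 3
df2 <- df2[order(df2$patient2, df2$date2), ]
df2$visit <- rep(1:3, times = 26)

first  <- df2[df2$visit == 1, ]
second <- df2[df2$visit == 2, ]
third  <- df2[df2$visit == 3, ]

# One row per patient and phase: assessment runs from visit 1 to visit 2,
# treatment from visit 2 to visit 3
segments <- data.frame(
  patient2 = factor(rep(LETTERS, 2), levels = LETTERS),
  start = c(first$date2, second$date2),
  end = c(second$date2, third$date2),
  phase = rep(c("Assessment", "Treatment"), each = 26)
)

p <- ggplot(segments, aes(x = start, xend = end,
                          y = patient2, yend = patient2, colour = phase)) +
  geom_segment() +
  labs(x = "Date", y = "Patient", colour = "Phase")
print(p)
```

Because each phase is its own row, the segment lengths also show directly how much of the contact time was assessment and how much was treatment.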

Hadley
#
Thank you for solving my problem, it worked out beautifully.

This was exactly what I was looking for, the ggplot2 package keeps
impressing me.

Thanks,
Eric
On Wed, Dec 7, 2011 at 6:01 AM, Hadley Wickham <hadley at rice.edu> wrote: