Message-ID: <49CF6AF6.3040904@idi.ntnu.no>
Date: 2009-03-29T12:35:02Z
From: Wacek Kusnierczyk
Subject: select observations from longitudinal data
In-Reply-To: <49CF64AB.6090202@biostat.ku.dk>

Peter Dalgaard wrote:
>
>>
>>     times = 3:4
>>     do.call(rbind, by(data, data$id, function(data)
>>         with(data, {
>>             rows = (time == times[which(times %in% time)[1]])
>>             if (is.na(rows[1])) data.frame(id=id, time=NA, x=NA) else
>>                 data[rows,] })))
>>
>>     #   id time  x
>>     # 1  1    3 23
>>     # 2  2    3 13
>>     # 3  3    3 15
>>     # 4  4    3 27
>>
>> is this what you wanted?
>
> There's also the straightforward answer:
>
> > sapply(split(data,data$id), function(d) { r <- d$x[d$time==3]
> +    if(!length(r)) r <- d$x[d$time==4]
> +    if(!length(r)) r <- NA
> +    r})
>  1  2  3  4
> 23 13 15 27
>
> or, just to check out the case where time==3 is actually missing:
>
> > sapply(split(data[-c(6,13),],data$id[-c(6,13)]), function(d) {
> +    r <- d$x[d$time==3]
> +    if(!length(r)) r <- d$x[d$time==4]
> +    if(!length(r)) r <- NA
> +    r})
>  1  2  3  4
> 23 14 15 NA

indeed, and although the output is not a data frame and does not report
the time actually used, it should be easy to add this if needed.  your
solution is more efficient, and if the output is sufficient, it might be
preferable.
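
for what it's worth, a minimal sketch of how the split-based approach could
be made to return a data frame that also reports the time actually used.
the data below are made up for illustration (they are not the original
poster's data), pick is a hypothetical helper name, and the sketch assumes
at most one row per id and time:

```r
# hypothetical data, made up for illustration only
data = data.frame(
    id   = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
    time = c(1, 3, 4, 2, 4, 3, 5, 1, 2),
    x    = c(21, 23, 24, 12, 14, 15, 16, 25, 26))

times = 3:4    # acceptable time points, in order of preference

# pick is a hypothetical helper: for one subject's rows, return the row
# at the first preferred time that is present, or a single NA row
pick = function(d) {
    t = times[times %in% d$time][1]    # NA if no preferred time is present
    if (is.na(t))
        data.frame(id = d$id[1], time = NA, x = NA)
    else
        d[d$time == t, , drop = FALSE]
}

result = do.call(rbind, lapply(split(data, data$id), pick))
rownames(result) = NULL
result

#   id time  x
# 1  1    3 23
# 2  2    4 14
# 3  3    3 15
# 4  4   NA NA
```

here id 2 has no observation at time 3 and falls back to time 4, while id 4
has neither and gets a row of NAs.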

vQ