select observations from longitudinal data set
Try this. 'by' splits the data frame into one sub-dataframe per id, and f then acts separately on each sub-dataframe, returning a regular ts series with NAs at the missing times. cbind'ing those all together gives us this series with one column per id:
  tt
  Time Series:
  Start = 1
  End = 6
  Frequency = 1
     1  2  3  4  5
  1 10  8  8  9  7
  2 12 NA NA NA  9
  3 15  9 NA NA NA
  4 NA 11 16 NA NA
  5 NA 12 NA 13 NA
  6 18 NA NA NA 11

and finally we use a string of ifelse's to choose the correct values.

  library(zoo)
  f <- function(d) as.ts(zoo(d$y, d$time, freq = 1))
  tt <- do.call(cbind, by(dat, dat$id, f))
  ifelse(is.na(tt[4,]), ifelse(is.na(tt[3,]), tt[5,], tt[3,]), tt[4,])
   1  2  3  4  5
  15 11 16 13 NA

As in the example data, we have assumed that at least one of the sub-dataframes has a point at time 1 and at least one has a point at time 5.
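
For comparison, here is a minimal base-R sketch of the same rule that does not go through zoo or ts. It is only an illustration, not the approach above: `pick_one` and its `pref` argument are hypothetical names introduced here, and `dat` is rebuilt from just the rows shown in the question. It assumes each id has at most one row per time, and it reports time 4 for an id with no usable observation, as in the desired output in the question.

```r
# Example data: only the rows listed in the question.
dat <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5),
  time = c(1, 2, 3, 6, 1, 3, 4, 5, 1, 4, 1, 5, 1, 2, 6),
  y    = c(10, 12, 15, 18, 8, 9, 11, 12, 8, 16, 9, 13, 7, 9, 11)
)

# Hypothetical helper: for one id's rows, return the row at the first
# preferred time that has a non-missing y; otherwise a row with y = NA.
pick_one <- function(d, pref = c(4, 3, 5)) {
  ok  <- d[!is.na(d$y), ]        # treat a recorded NA like an absent row
  hit <- match(pref, ok$time)    # row positions for times 4, 3, 5
  hit <- hit[!is.na(hit)]
  if (length(hit) == 0) {
    return(data.frame(id = d$id[1], time = pref[1], y = NA_real_))
  }
  ok[hit[1], ]
}

# Apply the helper to each id and stack the one-row results.
res <- do.call(rbind, lapply(split(dat, dat$id), pick_one))
res
```

Unlike the ts version, this indexes by the actual time values rather than by row position, so it does not depend on some id having an observation at time 1.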
On Sun, Jan 18, 2009 at 2:42 AM, gallon li <gallon.li at gmail.com> wrote:
I have the following longitudinal data:
id time y
1 1 10
1 2 12
1 3 15
1 6 18
2 1 8
2 3 9
2 4 11
2 5 12
3 1 8
3 4 16
4 1 9
4 5 13
5 1 7
5 2 9
5 6 11
....
I want to select the observation at time 4. If the observation at time 4 is
missing, then I want to select the observation at time 3. If the observation
at time 3 is also missing, then I want to select the observation at time 5.
Otherwise I will put a missing value there. The selected set looks like
id time y
1 3 15
2 4 11
3 4 16
4 5 13
5 4 NA
...
so the rule is (1) obs at time 4 for each id; (2) if no such obs, then look
for obs at time 3; (3) if no such obs, then look for obs at time 5; (4)
otherwise, NA.