Maybe you could try something like this:
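# example data from the original question
dat <- data.frame(x = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))
#############
# per-column maximum within each value of x; a group that is all NA
# gives -Inf (with a warning), which is reset to NA below
out <- as.data.frame(lapply(dat[-1], function(y, x) tapply(y, x, max,
    na.rm = TRUE), x = dat$x))
out[out == -Inf] <- NA
out$x <- sort(unique(dat$x))  # tapply() orders the groups by sorted x
out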
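
An alternative sketch, assuming the same example data, uses aggregate() from base R. The helper name max_or_na is made up for illustration; it returns NA directly for a group with no observed values, so the -Inf replacement step is not needed:

```r
# Same example data as in the question
dat <- data.frame(x = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))

# Hypothetical helper: maximum of the non-missing values,
# or NA when every value in the group is missing
max_or_na <- function(v) if (all(is.na(v))) NA else max(v, na.rm = TRUE)

# One row per distinct x, with the per-group maximum of each y column
out2 <- aggregate(dat[-1], by = list(x = dat$x), FUN = max_or_na)
out2
```

Here x comes out as the first column, and the groups are ordered by x.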
I hope it helps.
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
----- Original Message -----
From: "Anders Bjørgesæter" <anders.bjorgesater at bio.uio.no>
To: <r-help at stat.math.ethz.ch>
Sent: Wednesday, August 03, 2005 10:40 AM
Subject: [R] filter data set unique, duplicate..
Hello
First, thanks for the help with an earlier question about error
handling!
I have a problem filtering a data set.
I'm trying to filter the data in the y columns based on the values
in the x column, e.g.:
  x   y1   y2   yn
1.0    1   NA    3
2.0    1   NA   11
2.0    2   NA   NA
3.0    1    5   16
3.0    7    5    2
4.0    8    4    1
I want to keep the highest y in each column when x is identical, like this:
  x   y1   y2   yn
1.0    1   NA    3
2.0    2   NA   11
3.0    7    5   16
4.0    8    4    1
or just as good:
  x   y1   y2   yn
1.0    1   NA    3
2.0  NA*   NA   NA
2.0    2   NA   11
3.0  NA*    5   16
3.0    7  NA*  NA*
4.0    8    4    1
If anyone has any suggestions or pointers on how to do this, I would
really appreciate it.
/Anders