Dear R-help readers,

It would be great if someone could tell me what I'm doing wrong. I have a (shortened) data frame consisting of a group identifier and a number:
ex
        UID   REL
1  R1.B8.31 0.000
2  R1.B8.31 0.000
3  R1.B8.31 0.000
4  R1.B8.31 0.000
5  R1.B8.38 0.010
6  R1.B8.38 0.060
7  R1.B8.38 0.006
8  R1.B8.38 0.010
9  R1.B8.48 0.080
10 R1.B8.48    NA
11 R1.B8.48 0.006

I'm now creating a subset that excludes the values 0 and NA:
newex<-subset(ex,ex$REL>0)
newex
        UID   REL
5  R1.B8.38 0.010
6  R1.B8.38 0.060
7  R1.B8.38 0.006
8  R1.B8.38 0.010
9  R1.B8.48 0.080
11 R1.B8.48 0.006

and would now like to compute the mean for each group in UID:
tapply(newex$REL,newex$UID,mean,rm.na=T)
R1.B8.31 R1.B8.38 R1.B8.48
      NA   0.0215   0.0430
To my surprise, I still get an entry for group R1.B8.31, even though all of
its rows were removed by the subset function beforehand.
I can remove the NA by
tapply(newex$REL,interaction(newex$UID,drop=T),mean,rm.na=T)
but I would like to know why tapply still seems to use the original data frame.
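
A self-contained sketch that reproduces the behaviour described above, assuming UID is stored as a factor (the data frame is rebuilt by hand here, and the names with_empty and without_empty are only illustrative). It suggests that the subset itself is fine and that the extra group comes from the factor still carrying the unused level R1.B8.31:

```r
# Rebuild the example data; UID is assumed to be a factor.
ex <- data.frame(
  UID = factor(rep(c("R1.B8.31", "R1.B8.38", "R1.B8.48"), times = c(4, 4, 3))),
  REL = c(0, 0, 0, 0, 0.010, 0.060, 0.006, 0.010, 0.080, NA, 0.006)
)

# subset() keeps only rows where the condition is TRUE,
# so both the zeros and the NA row are dropped.
newex <- subset(ex, REL > 0)

# The rows are gone, but the factor still lists all three levels.
print(levels(newex$UID))
print(table(newex$UID))

# tapply() builds one cell per factor level; a level with no
# observations is left as NA.
with_empty <- tapply(newex$REL, newex$UID, mean)
print(with_empty)

# Dropping the unused levels first removes the empty cell.
newex$UID <- droplevels(newex$UID)
without_empty <- tapply(newex$REL, newex$UID, mean)
print(without_empty)
```

Calling factor(newex$UID) instead of droplevels() should have the same effect. Separately, the argument that mean() recognises is na.rm; rm.na=T is passed along through "..." and silently ignored, which makes no difference here only because the NA row is already gone after subset().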
Many thanks for your help
Frank
Frank Mattes,                        e-mail: f.mattes at ucl.ac.uk
Department of Virology               fax 0044(0)207 8302854
Royal Free Hospital and              tel 0044(0)207 8302997
University College Medical School
London