A query about na.omit
On Wed, 2009-04-01 at 16:49 +0100, Jose Iparraguirre D'Elia wrote:
Dear all,

Say I have the following dataset:

DF

         x   y   z
    [1]  1   1   1
    [2]  2   2   2
    [3]  3   3  NA
    [4]  4  NA   4
    [5] NA   5   5

And I want to omit all the rows which have NA, but only in columns X and Y, so that I get:

     x   y   z
     1   1   1
     2   2   2
     3   3  NA

If I use na.omit(DF), I would delete the row for which z=NA, obtaining thus

     x   y   z
     1   1   1
     2   2   2

But this is not what I want, of course. If I use na.omit(DF[,1:2]), then I obtain

     x   y
     1   1
     2   2
     3   3

which is OK for x and y columns, but I wouldn't get the corresponding values for z (ie 1 2 NA)

Any suggestions about how to obtain the desired results efficiently (the actual dataset has millions of records and almost 50 columns, and I would apply the procedure on 12 of these columns)?

Sincerely,
Jose Luis

Jose Luis Iparraguirre
Senior Research Economist
Economic Research Institute of Northern Ireland
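Hi Jose Luis,

I think this script is sufficient for your problem:

    # Rebuild the example data as a 5 x 3 matrix, filled row by row
    tab <- matrix(c(1,1,1, 2,2,2, 3,3,NA, 4,NA,4, NA,5,5), ncol=3, byrow=TRUE)
    # Keep only the rows where neither column 1 nor column 2 is NA
    tab[!is.na(tab[,1]) & !is.na(tab[,2]), ]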
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
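
For the case with 12 columns to check, chaining one !is.na() term per column becomes unwieldy. One possible alternative, not proposed in the thread itself, is base R's complete.cases() applied only to the columns of interest. The sketch below is a minimal illustration on the toy data from the question; the vector name check_cols is hypothetical and would hold the 12 real column names. It has not been timed on data with millions of rows.

```r
# Toy data from the question, built as a data frame.
DF <- data.frame(x = c(1, 2, 3, 4, NA),
                 y = c(1, 2, 3, NA, 5),
                 z = c(1, 2, NA, 4, 5))

# Columns in which an NA should cause the row to be dropped.
# (Hypothetical name; list the 12 real columns here.)
check_cols <- c("x", "y")

# complete.cases() gives TRUE for rows with no NA in the supplied columns.
# drop = FALSE keeps a data frame even if only one column is selected.
keep <- complete.cases(DF[, check_cols, drop = FALSE])

# Subset the full data frame, so z is retained even where it is NA.
result <- DF[keep, ]
print(result)
```

On the toy data this keeps rows 1 to 3, including the row where z is NA, which is the result asked for in the question.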