A query about na.omit
On 01-Apr-09 15:49:40, Jose Iparraguirre D'Elia wrote:
Dear all, Say I have the following dataset:
DF
     x  y  z
[1]  1  1  1
[2]  2  2  2
[3]  3  3 NA
[4]  4 NA  4
[5] NA  5  5

And I want to omit all the rows which have NA, but only in columns X
and Y, so that I get:

 x  y  z
 1  1  1
 2  2  2
 3  3 NA
Roll up your sleeves, and spell out in detail the condition you need:

  DF<-data.frame(x=c(1,2,3,4,NA),y=c(1,2,3,NA,5),z=c(1,2,NA,4,5))
  DF
  #    x  y  z
  # 1  1  1  1
  # 2  2  2  2
  # 3  3  3 NA
  # 4  4 NA  4
  # 5 NA  5  5

  DF[!(is.na(rowSums(DF[,(1:2)]))),]
  #   x y  z
  # 1 1 1  1
  # 2 2 2  2
  # 3 3 3 NA

Hoping this helps,
Ted.
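The same condition can also be written with complete.cases(), which is part of base R (package stats). The following is only a sketch of that alternative, not part of the reply above. It reuses the DF from the example, and the name `keep` is introduced here purely for illustration. One possible advantage is that complete.cases() does not need the checked columns to be numeric, whereas rowSums() does.

```r
# Example data, as in the thread
DF <- data.frame(x = c(1, 2, 3, 4, NA),
                 y = c(1, 2, 3, NA, 5),
                 z = c(1, 2, NA, 4, 5))

# TRUE for rows that have no NA in x and y; z is deliberately ignored
keep <- complete.cases(DF[, c("x", "y")])

# Subset the full data frame, so z comes along (including its NA)
res <- DF[keep, ]
res   # rows 1, 2 and 3 are kept
```

Rows 4 and 5 are dropped because y or x is missing there, while row 3 stays even though z is NA.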
If I use na.omit(DF), I would also delete the row for which z=NA, thus
obtaining
x y z
1 1 1
2 2 2
But this is not what I want, of course.
If I use na.omit(DF[,1:2]), then I obtain
x y
1 1
2 2
3 3
which is OK for the x and y columns, but I wouldn't get the corresponding
values for z (i.e. 1 2 NA)
Any suggestions about how to obtain the desired results efficiently
(the actual dataset has millions of records and almost 50 columns, and
I would apply the procedure on 12 of these columns)?
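For the larger case described here (a subset of columns chosen by name), one possible approach is a small helper around complete.cases(). This is a sketch under assumptions: the helper name `drop_na_in`, the simulated data frame `big` and its column names are all hypothetical, and no claim is made about timings on millions of rows.

```r
# Hypothetical helper: drop rows that have NA in any of the columns
# named in `cols`, leaving all other columns untouched.
drop_na_in <- function(df, cols) {
  df[complete.cases(df[, cols, drop = FALSE]), , drop = FALSE]
}

# Small simulated data set standing in for the real one (assumption)
set.seed(1)
n <- 1000
big <- data.frame(a = sample(c(1:5, NA), n, replace = TRUE),
                  b = sample(c(letters[1:3], NA), n, replace = TRUE),
                  c = sample(c(1:5, NA), n, replace = TRUE),
                  stringsAsFactors = FALSE)

# Columns to check; in the real data this would list the 12 columns
check_cols <- c("a", "b")
out <- drop_na_in(big, check_cols)
```

Here column b is character, so the check does not rely on the columns being numeric, and NAs in column c are left in place.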
Sincerely,
Jose Luis
Jose Luis Iparraguirre
Senior Research Economist
Economic Research Institute of Northern Ireland
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 01-Apr-09                                       Time: 18:00:53
------------------------------ XFMail ------------------------------