sometimes removing NAs from code
On Oct 26, 2011, at 10:25 AM, Schatzi wrote:
Sometimes I have NA values within specific columns of a data frame (in this example, the first two columns can have NAs). If there are NA values, I would like them to be removed. I have been using the code:

    y <- c(NA,5,4,2,5,6,NA)
    z <- c(NA,3,4,NA,1,3,7)
    x <- 1:7
    adata <- data.frame(y,z,x)
    adata <- adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

This works well if there are NA values, but when a dataset doesn't have NA values, this code messes up the data frame. I was trying to pick apart this code and could not understand why it didn't work when there were no NA values. If there are no NA values and I run just the part:

    apply(adata[,1:2],1,function(x)any(is.na(x)))

it results in:

        2     3     5     6
    FALSE FALSE FALSE FALSE

I was thinking that I can put in an if statement, but I think there has to be a better way. Any ideas/help? Thank you.
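A likely explanation for the failure, sketched on a small hypothetical data frame with no NAs: when no row matches, which() returns integer(0), and negating an empty index still gives an empty index, so the subset selects zero rows instead of dropping none. The names has_na, idx, dropped and kept are made up for this illustration.

```r
# Hypothetical data with no NA values in the first two columns
adata <- data.frame(y = c(5, 4, 5, 6), z = c(3, 4, 1, 3), x = 1:4)

# TRUE for each row that has an NA in column 1 or 2 (all FALSE here)
has_na <- apply(adata[, 1:2], 1, function(r) any(is.na(r)))

idx <- which(has_na)       # integer(0), since no row qualifies
dropped <- adata[-idx, ]   # -integer(0) is integer(0): selects zero rows
nrow(dropped)

# Logical indexing avoids the problem: !has_na is TRUE for every row
kept <- adata[!has_na, ]
nrow(kept)
```

Indexing with the logical vector directly, rather than with -which(...), behaves the same whether or not any NA is present.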
Presuming that you want to remove an entire row if any of the elements in that row are NA, see ?na.omit:
na.omit(adata)
  y z x
2 5 3 2
3 4 4 3
5 5 1 5
6 6 3 6

HTH,
Marc Schwartz
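Note that na.omit() looks at every column. If only the first two columns should decide which rows are dropped, as the original question describes, one possible sketch uses complete.cases() on just those columns. The variant adata2 with an NA in the third column is an assumption added here to show the difference; keep and cleaned are illustrative names.

```r
y <- c(NA, 5, 4, 2, 5, 6, NA)
z <- c(NA, 3, 4, NA, 1, 3, 7)
x <- 1:7
adata <- data.frame(y, z, x)

# TRUE for rows with no NA in columns 1 and 2
keep <- complete.cases(adata[, 1:2])
cleaned <- adata[keep, ]   # rows 2, 3, 5, 6, same as na.omit(adata) here

# Hypothetical variant: an NA in column 3 that should NOT cause a drop
adata2 <- adata
adata2$x[3] <- NA
by_cols <- adata2[complete.cases(adata2[, 1:2]), ]  # still keeps row 3
by_all  <- na.omit(adata2)                          # drops row 3 as well
```

Because keep is a logical vector, this also leaves the data frame untouched when there are no NA values at all.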