Question about a perceived irregularity in R syntax
Nordlund, Dan (DSHS/RDA) wrote:
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Peter Dalgaard
Sent: Thursday, July 22, 2010 3:13 PM
To: Pat Schmitz
Cc: r-help at r-project.org
Subject: Re: [R] Question about a perceived irregularity in R syntax
Pat Schmitz wrote:
Both vector queries can select the values from the data.frame as written;
however, in the first form, assigning a value to the selected elements
fails.
Can you explain why it fails?
dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
dat$Value[dat$Value == "NA"] <- 1 # Why does this fail to work,
dat$Value[dat$Value %in% NA] <- 1 # while this does work?
#Particularly when str() results in an equivalent class
dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
str(dat$Value[dat$Value %in% NA])
str(dat$Value[dat$Value == "NA"])
1. NA and "NA" are very different things
2. Check out is.na() and its help page
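The two points above can be sketched as follows. This is only an illustration built on the data frame from the question; the intermediate names (cmp, looks_selected, n_missing_after_eq, n_missing_after_isna) are made up for the example. It shows why the == "NA" form appears to select the missing values yet leaves them untouched on assignment, and how is.na() does the replacement.

```r
dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))

# Comparing with the string "NA" yields NA (not TRUE) at the missing positions.
cmp <- dat$Value == "NA"

# An NA in a logical index returns an NA element, so this *looks* like a match...
looks_selected <- dat$Value[cmp]

# ...but in assignment the NA index positions are skipped, so nothing changes.
dat$Value[cmp] <- 1
n_missing_after_eq <- sum(is.na(dat$Value))

# is.na() returns only TRUE/FALSE, so the replacement takes effect.
dat$Value[is.na(dat$Value)] <- 1
n_missing_after_isna <- sum(is.na(dat$Value))
```

The NA values shown by str() in the first form come from indexing with NA, not from the comparison having matched anything.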
I also would have suggested is.na() to do the replacement. What surprised me was that dat$Value[dat$Value %in% NA] <- 1 actually worked. I guess I always assumed that if
NA == NA
[1] NA
then an attempt to compare NA to elements in a vector would also return NA, but not so.
NA %in% c(1,NA,3)
[1] TRUE
Learned something new today,
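A small sketch of the difference described above, assuming a recent R version in which %in% is implemented on top of match(). The help page for match() states that NA matches NA and no other value, which would account for the TRUE result. The variable names are illustrative only.

```r
x <- c(1, NA, 3)

# Elementwise comparison propagates NA.
eq_scalar <- NA == NA          # NA
eq_vector <- x == NA           # NA NA NA

# match() treats NA as matching NA, and %in% is built on match().
pos   <- match(NA, x)          # position of the NA in x
found <- NA %in% x             # TRUE, as in the output above

# is.na() is the usual test for missing values.
missing <- is.na(x)            # FALSE TRUE FALSE
```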
I suspect that's not intentional, though I'm not sure it should be fixed. According to the usual convention the result should be a logical NA.
Duncan Murdoch