NA in character vectors

1 message · David Brahm

Hi all,

While R-1.5.0 is still unfrozen, I'd like to try again to generate interest in
my favorite pet peeve: NA's in character vectors. Last October I wrote:

            
We had an interesting discussion then, and I learned (from Duncan Murdoch and
Thomas Lumley) that R does have an internal code for missing char values
(R_NaString), but it gets easily confused with the string "NA".  Check this:

  R> z <- c(LETTERS[c(2,NA)], "NA", paste("NA"))
  R> is.na(z)
     [1] FALSE  TRUE  TRUE FALSE
  R> z[3]==z[4]
     [1] TRUE
  R> z=="NA"
     [1] FALSE  TRUE  TRUE  TRUE

Thomas Lumley <tlumley@u.washington.edu> suggested that this weird behavior
essentially arises because the parser converts "NA" to R_NaString, and so...
While I still like the simple S-Plus model (a missing string is just ""),
Thomas's suggestions (for example, PRINTNAME(R_NaString) = "<NA>") would be OK
too.  Thanks for listening (again)!