I often use merge() on data frames containing character vectors, some of whose elements are the string "NA" (not the same thing, obviously, as NA in a numeric or factor vector). For example, the stock ticker for Nabisco was "NA". Unfortunately (for me), merge seems to insist on inserting "NA" for missing values. My question: is there some way around this? Here is a simple example:
version
         _
platform sparc-sun-solaris2.6
arch     sparc
os       solaris2.6
system   sparc, solaris2.6
status
major    1
minor    3.0
year     2001
month    06
day      22
language R
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"))
merge(a, b, all.x = TRUE)
  x  y
1 1 NA
2 2  a
3 3  b
4 4 NA

Rows 1:3 are what I expect them to be. Row 4 is "wrong" in the sense that data frame b did not contain a row for x = 4, yet its y value prints exactly like the genuine "NA" string in row 1. Of course, there is a sense in which *any* value, including "", placed in row 4 is potentially misleading.

Perhaps I am misunderstanding the meaning of "NA" in a character vector (i.e., I am not allowed to have "real" values that are that string). If there were some way (a "nomatch" argument?) for the user to specify what value is used for missing character strings, then I would be fine. Again, I suspect that my real problem is not understanding how to specify "NA" -- meaning Nabisco's ticker symbol -- in a character vector.

Any suggestions would be much appreciated.

Dave Kane
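
For what it is worth, here is a minimal sketch of one way the two cases might be told apart after the merge. It assumes a recent version of R (not the 1.3.0 shown above, where behaviour may differ) and assumes y is kept as a character vector via stringsAsFactors = FALSE. The "(no match)" placeholder is purely hypothetical; merge itself has no "nomatch" argument as far as I know.

```r
a <- data.frame(x = 1:4)
# Assumption: keep y as character rather than factor.
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = FALSE)

m <- merge(a, b, all.x = TRUE)

# is.na() is TRUE only for the genuinely missing value that merge
# filled in for x = 4, not for the literal string "NA" in row 1.
missing_y <- is.na(m$y)

# Hypothetical placeholder for the unmatched rows; any string
# that cannot be a real ticker would do.
y_filled <- ifelse(missing_y, "(no match)", m$y)
```

Under these assumptions, missing_y should be TRUE only for the x = 4 row, so the string "NA" in row 1 survives untouched in y_filled.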