Character vectors, NA's and "" in merges
On Wed, 26 Sep 2001, David Kane wrote:
I often use merge with dataframes that contain character vectors which have elements that are sometimes "NA" (meaning the string NA, not the same thing, obviously, as NA in a numeric or factor vector). For example, the stock ticker for Nabisco was "NA". Unfortunately (for me), it seems like merge insists on inserting "NA" for missing values. My question: Is there some way around this?
Here is a simple example:
version
         _
platform sparc-sun-solaris2.6
arch     sparc
os       solaris2.6
system   sparc, solaris2.6
status
major    1
minor    3.0
year     2001
month    06
day      22
language R
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"))
Take a look. b$y is a factor with levels "a" and "b", and a missing first value.
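One way to take that look is sketched below. It builds the column with factor() directly, an assumption made for illustration so that the result does not depend on the defaults of data.frame(). What it reports depends on the R version: this thread concerns R 1.3.0, where the first value is missing as described above, while later releases, as far as I know, keep the string "NA" as an ordinary level distinct from a missing value.

```r
# Rebuild b with y as an explicit factor (assumption: factor() is used here
# so the result does not depend on data.frame()'s string handling defaults).
b <- data.frame(x = 1:3, y = factor(c("NA", "a", "b")))

levels(b$y)   # which strings survived as factor levels
is.na(b$y)    # which elements R regards as missing
```

If the first element of is.na(b$y) is TRUE, the string "NA" was converted to a missing value when the factor was built, before merge was ever called.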
merge(a, b, all.x = TRUE)
  x  y
1 1 NA
2 2  a
3 3  b
4 4 NA

Rows 1:3 are what I expect them to be. Row 4 is "wrong" in the sense that dataframe b did not contain a row for x = 4. Of course, there is a sense that *any* value, including "", that is placed in row 4 is potentially misleading. Perhaps I am misunderstanding the meaning of "NA" in a character vector (i.e., I am not allowed to have "real" values that are that string).
That is the correct answer. Because you asked for all.x=TRUE, you got a missing value there in row 4 col 2.
If there were some way (a "nomatch" argument?) that the user could specify what missing values are used for character strings, then I would be fine. Again, I suspect that my real problem is not understanding how to specify "NA" -- meaning Nabisco's ticker symbol -- in a character vector.
You cannot avoid it being taken as the missing value, AFAIK.
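A possible workaround for telling unmatched rows apart from matched ones is sketched below, under stated assumptions. It adds a marker column to b before merging, then uses that marker to find the rows that had no partner. The column name in_b and the fill value "" are hypothetical, and stringsAsFactors = FALSE assumes an R release that has that argument (later than the 1.3.0 discussed here). It does not change how the string "NA" itself is stored; it only records which rows merge filled in.

```r
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"),
                stringsAsFactors = FALSE)  # assumption: keep y as character

b$in_b <- TRUE                    # hypothetical marker: row came from b
m <- merge(a, b, all.x = TRUE)    # unmatched rows get NA in y and in_b

unmatched <- is.na(m$in_b)        # TRUE only where b had no row for x
m$y[unmatched] <- ""              # hypothetical "nomatch" value
m$in_b <- NULL                    # drop the marker again
m
```

The marker is a logical column, so its missing values cannot be confused with any ticker string, whatever value ends up in y.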
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595