merge.data.frame can coerce character vectors to factor in some circumstances (PR#1608)
On Wed, 29 May 2002 a296180@agate.fmr.com wrote:
If the following two conditions are met:

1) all.x is TRUE
2) at least 1 row in y does not have a match in x

then any character vectors in y will be coerced to be factors. Here is a simple example (previously provided on r-devel):
    x <- data.frame(a = 1:4)
    y <- data.frame(b = LETTERS[1:3])
    y$b <- as.character(y$b)
    z <- merge(x, y, by = 0, all.x = TRUE)
    z

      Row.names a    b
    1         1 1    A
    2         2 2    B
    3         3 3    C
    4         4 4 <NA>

    sapply(z, data.class)

    Row.names         a         b
     "factor" "numeric"  "factor"
This problem could be fixed by changing the line in merge.data.frame:

    for (i in seq(along = y)) is.na(y[[i]]) <- (lxy + 1):(lxy + nxx)

to:

    for (i in seq(along = y)) y[((lxy + 1):(lxy + nxx)), i] <- NA
But other problems would be introduced, as the two operations are not equivalent (and the right one has been used).
To the extent that this is a feature rather than a bug (if so, I would like to know why),
I have already patiently explained it to you. It is a side issue of subscripting of data frames converting character columns to factor. I have also given you a workaround.
then I would suggest that the following sentence be added to the documentation for merge at the end of the section on all.x "Be aware that, if all.x equals `TRUE', character vectors in `y' will be converted to factors if any rows in y have no matching row in `x'."
As I said before, this is a consequence of the general rules. Data frames are not designed to have character columns, and those who insist on using them must make themselves aware of the consequences.
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
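
The thread mentions a workaround without stating it. As a minimal sketch of one possible approach (an assumption, not necessarily the workaround referred to above), factor columns can be converted back to character after the merge. The helper name factors_to_character is hypothetical and not part of base R, and the sketch assumes a current version of R in which assigning a character vector into a data frame leaves it as character.

```r
# Hypothetical helper (not part of base R): convert every factor column
# of a data frame back to character, leaving other columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

# Same data as the example in the report; stringsAsFactors = FALSE keeps
# y$b as character regardless of the R version's default.
x <- data.frame(a = 1:4)
y <- data.frame(b = LETTERS[1:3], stringsAsFactors = FALSE)

# Merge on row names, keep all rows of x, then undo any factor coercion.
z <- factors_to_character(merge(x, y, by = 0, all.x = TRUE))

# Inspect the resulting column classes.
sapply(z, class)
```

The conversion does nothing to a column that is already character, so the helper can be applied unconditionally after any merge.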