Why is there no c.factor?
A search for "c.factor" returns plenty of hits on this topic. Here's just one of them, from 2006, when I asked the same question:
http://tolstoy.newcastle.edu.au/R/e2/devel/06/11/1137.html

So it appears to be complicated, and there are good reasons. Since I needed it, I created c.factor in the data.table package, below. It is more efficient because it doesn't convert each factor to character (which would lose some of the benefit of using factors); instead it remaps the integer codes onto the combined levels. I've been told I'm not unique in this approach and that other packages also have their own c.factor. It deliberately isn't exported. It's worked well for me over the years anyway.

c.factor = function(...)
{
    args <- list(...)
    for (i in seq_along(args)) if (!is.factor(args[[i]])) args[[i]] = as.factor(args[[i]])
    # The first must be a factor, otherwise we wouldn't be inside c.factor; it's checked anyway in the line above.
    newlevels = sort(unique(unlist(lapply(args,levels))))
    ans = unlist(lapply(args, function(x) {
        m = match(levels(x), newlevels)
        m[as.integer(x)]
    }))
    levels(ans) = newlevels
    class(ans) = "factor"
    ans
}

"Hadley Wickham" <hadley at rice.edu> wrote in message news:f8e6ff051002040753x33282f33l78fce9f98dc29ae8 at mail.gmail.com...
Hi all,
Is there a reason that there is no c.factor method? Analogous to
c.Date, I'd expect something like the following to be useful:
c.factor <- function(...) {
    factors <- list(...)
    levels <- unique(unlist(lapply(factors, levels)))
    char <- unlist(lapply(factors, as.character))
    factor(char, levels = levels)
}
c(factor("a"), factor("b"), factor(c("c", "b","a")), factor("d"))
# [1] a b c b a d
# Levels: a b c d
Hadley
--
http://had.co.nz/
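
For readers comparing the two approaches in this thread, here is a minimal sketch of the integer-remapping idea used by the data.table version, written as a standalone function so that it does not mask any c method. The name combine_factors is hypothetical and is not part of base R or data.table. The combined levels are sorted, as in the data.table version.

```r
# Hypothetical helper (not in base R or data.table): combine factors by
# remapping integer codes rather than converting every element to character.
combine_factors <- function(...) {
  # Coerce any non-factor argument so that levels() is defined for all of them
  args <- lapply(list(...), as.factor)

  # Sorted union of all levels
  newlevels <- sort(unique(unlist(lapply(args, levels))))

  # For each factor, look up where each of its levels sits in newlevels,
  # then index that lookup table by the factor's own integer codes
  codes <- unlist(lapply(args, function(x) {
    m <- match(levels(x), newlevels)
    m[as.integer(x)]
  }))

  structure(codes, levels = newlevels, class = "factor")
}

x <- combine_factors(factor("a"), factor("b"), factor(c("c", "b", "a")), factor("d"))
as.character(x)
levels(x)
```

The match() call runs once per level rather than once per element, which is where the saving over the as.character route would be expected to come from when the inputs are long and have few levels.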