Set operation generics
Would anyone be interested in reviewing a patch to make the set operations (union, intersect, setdiff, setequal, is.element) generic?
S3 generics, S4 generics, or primitives?
I would expect S3. Can you even have an S4 generic in the base package? (i.e. before the methods package is loaded)
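For illustration, here is a minimal sketch of what the S3 route could look like, assuming the existing function is kept as the default method so current behaviour is untouched. The class name "myset" and its method are hypothetical and only there to show dispatch; this is not a proposed patch.

```r
# Sketch only: turn union() into an S3 generic, with the existing
# base implementation as the default method.
union <- function(x, y) UseMethod("union")
union.default <- function(x, y) base::union(x, y)

# Hypothetical class "myset" (not part of base R): a method that
# keeps the class on the result.
union.myset <- function(x, y)
{
    structure(base::union(unclass(x), unclass(y)), class = "myset")
}

a <- structure(c(1, 2, 3), class = "myset")
b <- structure(c(3, 4), class = "myset")
res <- union(a, b)      # dispatches to union.myset
plain <- union(1:3, 2:5) # falls through to union.default
```

Note that S3 dispatches on the first argument only, so union(1:3, a) would go to the default method in this sketch.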
Note that all of those functions rely heavily on match() behind the scenes.
If match() itself were an S4 generic (or a primitive like c() and [)
then union(), intersect(), setdiff(), is.element() could be defined
with something like:
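union <- function(x, y)
{
    xy <- c(x, y)
    sm <- match(xy, xy)
    ## keep only the first occurrence of each element
    xy[sm == seq_along(sm)]
}
intersect <- function(x, y)
{
    sm <- match(x, x)
    ## drop duplicates in x, then keep what is also in y
    x <- x[sm == seq_along(sm)]
    m <- match(x, y)
    x[!is.na(m)]
}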
setequal <- function(x, y)
{
    ## every element of x must be in y, and every element of y in x
    !(anyNA(match(x, y)) || anyNA(match(y, x)))
}
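The two functions mentioned above but not spelled out, setdiff() and is.element(), could follow the same pattern. This is a sketch in the same style, not the current base implementation; it assumes only that match(), [ and is.na() behave sensibly for the objects involved.

```r
# Sketch: setdiff() and is.element() written purely in terms of match().
setdiff <- function(x, y)
{
    sm <- match(x, x)
    ## drop duplicates in x, then keep what is NOT in y
    x <- x[sm == seq_along(sm)]
    x[is.na(match(x, y))]
}
is.element <- function(el, table)
{
    !is.na(match(el, table))
}

d <- setdiff(c(1, 2, 2, 3), 2)
e <- is.element(c(1, 5), c(1, 2, 3))
```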
Although I suspect R-core would prefer a minimal change where it's easier to see that existing behaviour is preserved.
For the record, match(), union(), intersect(), and setdiff() are S4 generics in the BiocGenerics package. But there is no doubt the situation would be better and cleaner if base::match() itself were an S4 generic or a primitive.
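To make the S4 variant concrete, here is a rough sketch of promoting match() to an S4 generic with setGeneric() from the methods package and adding a method for a class. The class "CIString" (case-insensitive strings) is made up for the example, and this is not how BiocGenerics actually defines its generics.

```r
library(methods)

# Hypothetical class, for illustration only.
setClass("CIString", contains = "character")

# Create an S4 generic from base::match in the current environment;
# base::match becomes the default method.
setGeneric("match")

setMethod("match", signature("CIString", "CIString"),
    function(x, table, nomatch = NA_integer_, incomparables = NULL)
    {
        ## compare on the lower-cased character data
        base::match(tolower(x@.Data), tolower(table@.Data),
                    nomatch = nomatch, incomparables = incomparables)
    })

x <- new("CIString", c("a", "B"))
y <- new("CIString", c("b", "C"))
m <- match(x, y)              # uses the CIString method
p <- match(c("a", "B"), "b")  # plain character: default method
```

The generic created this way lives outside base, so functions inside the base namespace still call the original match(); that is the limitation being discussed.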
By primitive, you mean internal generic?
Hadley
Chief Scientist, RStudio http://had.co.nz/