Set operation generics

Hi Hadley,
On 10/21/2013 10:51 AM, Hadley Wickham wrote:
S3 generics, S4 generics, or primitives?

Since they are binary operations, sounds like supporting multiple
dispatch would be a plus.

Note that all those things heavily rely on match() behind the scenes.
If match() itself was an S4 generic (or a primitive like c() and [)
then union(), intersect(), setdiff(), is.element() could be defined
with something like:

   union <- function(x, y)
   {
     xy <- c(x, y)
     sm <- match(xy, xy)
     xy[sm == seq_along(sm)]
   }

   intersect <- function(x, y)
   {
     sm <- match(x, x)
     x <- x[sm == seq_along(sm)]
     m <- match(x, y)
     x[!is.na(m)]
   }

   setequal <- function(x, y)
   {
     !(anyNA(match(x, y)) || anyNA(match(y, x)))
   }

and as long as your objects support [, c(), and match(), then the set
operations will work out-of-the-box on them. Note that you would also
get %in% for free.
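
For completeness, here is a rough sketch of how the remaining operations could follow the same pattern. This is only an illustration under the same assumption (that match(), [ and seq_along() work on the objects), not the actual base R implementation; the definitions deliberately mask the base functions of the same name in the current session.

```r
# Sketch only: setdiff(), is.element() and %in% expressed via match().

setdiff <- function(x, y)
{
  sm <- match(x, x)
  x <- x[sm == seq_along(sm)]   # keep the first occurrence of each element
  x[is.na(match(x, y))]         # keep the elements of x not found in y
}

is.element <- function(el, table)
{
  !is.na(match(el, table))
}

`%in%` <- function(x, table)
{
  !is.na(match(x, table))
}

setdiff(c(1, 2, 2, 3), c(2, 4))        # 1 3
is.element(c("a", "z"), c("a", "b"))   # TRUE FALSE
c(2L, 5L) %in% 1:3                     # TRUE FALSE
```

Each definition only needs match() and [, so a class that provides methods for those would not need its own set-operation methods.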

There might be some rare situations where it would still be useful
for the set operations to be generic functions, but I see a lot more
value in making match() itself a generic (which doesn't exclude also
making the set operations generic).

For the record, match(), union(), intersect(), and setdiff() are S4
generics in the BiocGenerics package. But there is no doubt it would
be a better/cleaner situation if base::match() itself was an S4 generic
or primitive.
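
To illustrate what a package currently has to do on its own, here is a minimal sketch of promoting match() to an S4 generic with the methods package and adding a method for a class. The "Caseless" class and its slot "s" are made up for this example; this is not how BiocGenerics defines its generic.

```r
library(methods)

# Hypothetical toy class: strings compared without regard to case.
setClass("Caseless", representation(s = "character"))

# Create an S4 generic from base::match() in the current environment.
# The existing function becomes the default method.
setGeneric("match")

setMethod("match", signature(x = "Caseless", table = "Caseless"),
  function(x, table, nomatch = NA_integer_, incomparables = NULL)
  {
    # Compare the lower-cased strings; 'incomparables' is ignored here.
    base::match(tolower(x@s), tolower(table@s), nomatch = nomatch)
  })

x <- new("Caseless", s = c("B", "z"))
y <- new("Caseless", s = c("a", "b"))
m <- match(x, y)   # 2 NA
```

Note that functions in base that call match() internally (like the existing %in%) keep using base::match(), so they do not see such a method. That is the part that would go away if base::match() itself did the dispatch.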

My 2 cents,

Cheers,
H.