Digest package - make digest generic?
On 10/15/07, Henrik Bengtsson <hb at maths.lth.se> wrote:
[As agreed, CC:ing r-devel since others might be interested in this as well.]
Hi.
On 10/15/07, Dirk Eddelbuettel <edd at debian.org> wrote:
Hi Hadley,
On 15 October 2007 at 09:51, hadley wickham wrote:
| Would you consider making digest a generic function? That way I could
| (e.g.) make a generic method for ggplot objects which didn't depend
| (so much) on their internal representation.
Well, generally speaking, I always take patches :)
I see no problems in doing this. The patch would be:
digest <- function(...) UseMethod("digest");
digest.default <- <current digest function>.
I think that should do, and I don't think it has any surprising side
effects so it could be added in the next release. Dirk, can you do
that?
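The proposed patch can be sketched outside the package as follows. This is only an illustration, assuming the digest package is installed: it wraps digest::digest() instead of modifying it, and the class "example_plot" with its fields (data, mapping, cache) is hypothetical, not anything from ggplot.

```r
# Sketch only: a generic wrapper, not the actual patch to the package.
digest <- function(object, ...) UseMethod("digest")

# Default method: fall back to the existing implementation.
digest.default <- function(object, ...) digest::digest(object, ...)

# Hypothetical class: hash only the fields that define the object's state.
digest.example_plot <- function(object, ...) {
  digest.default(unclass(object)[c("data", "mapping")], ...)
}

# Two objects that differ only in an internal field.
p1 <- structure(list(data = 1:3, mapping = "x", cache = 1),
                class = "example_plot")
p2 <- structure(list(data = 1:3, mapping = "x", cache = 2),
                class = "example_plot")

same_by_method  <- identical(digest(p1), digest(p2))
same_by_default <- identical(digest.default(unclass(p1)),
                             digest.default(unclass(p2)))
```

With the method in place the two objects hash identically, while hashing the full contents does not, which is the behaviour the request is after.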
I have to admit that I am fairly weak on these aspects of the S language. One question is: how do the current users of digest (i.e. Henrik's and Seth's caching mechanism, for example) use it on arbitrary objects _without_ it being generic?
I basically put everything I want into a list() and pass that to digest::digest().
Yes, that's what I'm doing too.
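For readers unfamiliar with this idiom, a rough sketch of the list() approach, assuming the digest package is installed. The object "fit" and its fields are made up for illustration.

```r
# Hypothetical object with a field that should not affect the hash.
fit <- list(formula = "y ~ x", data = c(1, 2, 3), timestamp = Sys.time())

# Put only the identifying parts into a list() and hash that.
key <- list(formula = fit$formula, data = fit$data)
hash <- digest::digest(key)

# Rebuilding the key later gives the same hash, whatever the timestamp is.
fit$timestamp <- fit$timestamp + 60
hash_again <- digest::digest(list(formula = fit$formula, data = fit$data))
```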
| The reason I ask is that I'm using digest as a way of coming up with a
| unique file name for each example graphic. I want to be able to
| easily compare the appearance of examples between versions, but
| currently the digest depends on internal details, so it's hard to
| match up graphics between versions.
See loadCache(key) and saveCache(object, key) in R.cache, which basically loads and saves results from and to a file cache based on a key object - no need to specify paths or filenames. You can specify paths etc if you want to, but by default it is just transparent.
The problem is I need to refer to the image from the documentation, so I do need to know its path. I also want to be able to look at the image, so if the digests are different I can see what the difference is (I'm planning to automate this with the imagemagick compare command line tool).
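A minimal sketch of deriving a known path from a digest, assuming the digest package is installed. The helper example_image_path(), the "examples" directory and the key contents are hypothetical names chosen for illustration.

```r
# Hypothetical helper: build a stable image path from a key object.
example_image_path <- function(key, dir = "examples") {
  file.path(dir, paste0(digest::digest(key), ".png"))
}

key <- list(example = "geom_point", args = list(size = 2))
path <- example_image_path(key)

# The same key always maps to the same path, so the files produced by
# two versions can be matched up and then compared.
path_again <- example_image_path(key)
```

Because the path is computed rather than hidden inside a cache, the documentation can refer to it directly.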
However, I think Hadley is referring to a different problem. Basically, he has an object containing a lot of fields, but for his purposes only a subset of the fields should be used to generate a consistent hashcode. If he passes any other field, that
Yes, exactly.
will break the consistency. In that case, the designer of the class has to identify the fields that uniquely identify the state of the object. I do that for many of my objects and pass them down in a list() structure to digest(). I agree, by making digest() generic, one can make the code nicer. [If there is a need to dispatch on multiple arguments, we have to go for S4, but otherwise S3 gives the minimal modification]. Side comment: This basically comes down to how for instance Java deals with hashCode() and equals() etc. By default the object as is is used to generate the hashcode (and can be used by equals() to compare objects).
Yes, that's the model I was thinking of too. Hadley