modifying a persistent (reference) class

7 messages · Brian Lee Yung Rowe, Ross Boylan, Gábor Csárdi +1 more

#
I saved objects that were instances of several reference classes.
Later I modified the reference class definitions a bit, creating new
functions and deleting old ones.  The total number of functions did not
change.  When I read the saved objects back I could only access some of
the original data.

I asked on the user list and someone suggested sticking with the old
class definitions, creating new classes, reading in the old data, and
converting it to the new classes.  This would be awkward (I want the
"new" classes to have the same name as the "old" ones), and I can
probably just leave the old definitions and define the new functions I
need outside of the reference classes.

Are there any better alternatives?

On reflection, it's a little surprising that changing the code for a
reference class makes any difference to an existing instance, since all
the function definitions seem to be attached to the instance.  One
problem I've had in the past was precisely that redefining a method in a
reference class did not change the behavior of existing instances.  So
I've tried to follow the advice to keep the methods light-weight.

In this case I was trying to move from a show method (that just printed)
to a summary method that returned a summary object.  So I wanted to add
a summary method, redefine show in the base class to call summary, and
remove all the subclass definitions of show.

Regular S4 classes are obviously not as sensitive since they usually
don't include the functions that operate on them, but I suppose if you
changed the slots you'd be in similar trouble.

Some systems keep track of versions of class definitions and allow one
to write code to migrate old to new forms automatically when the data
are read in.  Does R have anything like that?

The system on which I encountered the problems was running R 2.15.
#
On Fri, 2014-08-01 at 14:42 -0400, Brian Lee Yung Rowe wrote:
My recollection is that in GemStone's Smalltalk database you can define
methods associated with a class that describe how to change an instance
from one version to another.  You also have the choice of upgrading all
persistent objects at once or doing so lazily, i.e., as they are
retrieved.
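
For illustration only, here is a rough sketch of what lazy,
version-by-version migration on read could look like in plain R.  Nothing
here is built into R: the `version` field, the `migrations` list and
`read_migrated()` are all hypothetical names, and the payload is an
ordinary list rather than a reference class instance.

```r
# Hypothetical migration steps: element k upgrades a version-k payload
# to version k + 1.
migrations <- list(
  function(x) { x$units <- "seconds"; x$version <- 2L; x },
  function(x) { x$elapsed <- x$time; x$time <- NULL; x$version <- 3L; x }
)

# Read a payload and apply any outstanding migrations (the "lazy" option:
# objects are upgraded only when they are retrieved).
read_migrated <- function(path) {
  x <- readRDS(path)
  while (x$version <= length(migrations)) {
    x <- migrations[[x$version]](x)
  }
  x
}

# Usage: a payload written under the hypothetical version 1 layout.
path <- tempfile(fileext = ".rds")
saveRDS(list(version = 1L, time = 12.5), path)
current <- read_migrated(path)
```

Upgrading everything at once would amount to running `read_migrated()`
over every stored file and saving the results back.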

The brittleness of the representation depends partly on the details.  If
a class has 2 slots, a and b, and the only thing on disk is the contents
of a and the contents of b, almost any change will screw things up.
However, if the slot name is persisted with the instance it's much
easier to reconstruct the instance if the class changes (if slot c is
added and not on disk, set it to nil; if b is removed, throw it out when
reading from disk).  One could also persist the class definition, or
key elements of it, with individual instances referring to the
definition.
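
A minimal sketch of the name-based reconstruction just described, in
plain R.  This is not how R itself persists class instances; `reconcile()`
and the field layouts are made up for the example, and NA stands in for
the "nil" default.

```r
# Reconcile a stored, named set of slots with the current layout:
# keep slots that still exist, drop removed ones, fill new ones
# from the defaults.
reconcile <- function(stored, defaults) {
  kept   <- stored[intersect(names(stored), names(defaults))]
  absent <- setdiff(names(defaults), names(kept))
  c(kept, defaults[absent])[names(defaults)]
}

# On disk: the old layout with slots a and b, names included.
path <- tempfile(fileext = ".rds")
saveRDS(list(a = 1.5, b = "obsolete"), path)

# Current (hypothetical) layout: b has been removed and c added.
defaults <- list(a = NA_real_, c = NA_character_)
restored <- reconcile(readRDS(path), defaults)
```

A change in the type of a slot would still need a hand-written rule; this
only handles added and removed names.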

I don't know which, if any, of these strategies R uses for reference or
other classes.
Arguably :)  As I said, some representations could do this
automatically.  And there are still issues such as a change in the type
of a slot, or rules for filling new slots, that would require
intervention.

In my experience with other object systems, usually methods are
attributes of the class.  For R reference classes they appear to be
attributes of the instance, potentially modifiable on a per-instance
basis.

Ross
#
On Fri, 2014-08-01 at 16:06 -0400, Brian Lee Yung Rowe wrote:
In Smalltalk everything is an object, and that includes functions,
including class methods.
My immediate problem is/was that I have serialized objects representing
weeks of CPU time.  I have to work with them, not some other
representation they might have.  And it's much more natural to work with
R's native persistence than some other scheme I cook up.

I think persistence requires serialization.  The serialization can be
more or less brittle, but I don't think there is an alternative to
serialization.

Since I just worked around my immediate problem a few minutes ago (by
retaining the original class definitions and using setMethod to create
summary methods), my interests are a bit more theoretical.
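
Roughly, the workaround looks like the sketch below.  The class `Run`, its
fields and the contents of the summary are invented for the example; the
point is only that the class definition is left as it was when the objects
were saved, and the new behaviour is attached from outside with setMethod.

```r
library(methods)

# Hypothetical stand-in for the original class definition, left unchanged.
Run <- setRefClass("Run",
  fields = list(label = "character", values = "numeric"),
  methods = list(
    show = function() cat("Run", label, "with", length(values), "values\n")
  )
)

# Save an instance under the original definition.
path <- tempfile(fileext = ".rds")
saveRDS(Run$new(label = "a", values = c(1, 2, 3)), path)

# New behaviour is defined outside the class, as an S4 method on summary,
# so nothing stored inside the saved instances has to change.
setMethod("summary", "Run", function(object, ...) {
  list(label = object$label,
       n     = length(object$values),
       mean  = mean(object$values))
})

restored <- readRDS(path)
s <- summary(restored)
```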

First, I'd like to understand more about exactly what is saved to disk
for reference and other classes, in particular how much meta-information
they contain.  And my mental model for reference class persistence is
clearly wrong, because in that model instances based on old definitions
come back intact (albeit not with the new method definitions or other
new slots), whereas mine seemed to come back damaged.

Second, I'm still hoping for some elegant way around this problem (how
to redefine classes and still use saved versions from older definitions)
for the future, both with reference and regular classes.  Or at least
some rules about what changes, if any, are safe to make in class
definitions after an instance has been persisted.

Third, if changes to R could make things better, I'm hoping some
developers might take them up.  I realize that is unlikely to happen,
for many good reasons, but I can still hope :)

Ross
#
On Fri, Aug 1, 2014 at 4:47 PM, Ross Boylan <ross at biostat.ucsf.edu> wrote:
[...]
I believe that the brand new R6 class system can do this. I mean your
saved instances from old classes will be read back properly, with the
old methods. They are on CRAN and also here if you want to experiment:
https://github.com/wch/R6

Best,
Gabor
#
R6 objects are basically just environments, so they're probably pretty
simple to save and restore (I haven't tested it out, though).
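
As a rough illustration of the idea, using a plain environment rather than
R6 itself (so this says nothing definite about R6), an environment that
carries its own functions comes back from saveRDS/readRDS with those
functions still attached.  `make_counter()` is a made-up constructor.

```r
# A hand-rolled object: an environment holding both data and a function.
make_counter <- function() {
  self <- new.env()
  self$count <- 0
  self$add <- function(n = 1) {
    self$count <- self$count + n
    invisible(self)
  }
  self
}

obj <- make_counter()
obj$add(2)

# Round trip through R's native serialization.
path <- tempfile(fileext = ".rds")
saveRDS(obj, path)
restored <- readRDS(path)

# The restored copy still has the method it was saved with, and that
# method updates the restored copy, not the original.
restored$add(3)
```

Because the function travels with the environment, later changes to
`make_counter()` would not affect objects that were already saved.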

-Winston
On Fri, Aug 1, 2014 at 4:00 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote: