attributes of environments

24 messages · Thomas Lumley, Henrik Bengtsson, Simon Urbanek +3 more

#
In the code below, e is an environment which we copy to f and then
add attributes to e.  Now f winds up with the same attributes.

In other words it seems that the attributes are a property of the
environment itself and not of the variable.  Thus it appears we
cannot have two environment variables that correspond to the
original environment but with different attributes.

I can understand if we changed a component of e then
f would reflect that too but I am not sure that this is also
desirable for attributes as they are not "in" the environment.

Is that desirable?  Is it a bug?  No other class works that way
AFAIK.   Comments?
<environment: 0x01a577f0>
attr(,"X")
[1] "Y"
[1] "R version 2.4.0 Under development (unstable) (2006-07-04 r38480)"
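
The code that produced the output above is missing from this archived copy. A minimal sketch that reproduces the behaviour described, assuming only the names e and f from the text and the attribute "X" shown in the output (the exact original code is not recoverable here):

```r
e <- new.env()        # an environment
f <- e                # "copy" it: f is a second reference, not a new object
attr(e, "X") <- "Y"   # add an attribute through e only

attr(f, "X")          # "Y": f sees the attribute as well
identical(e, f)       # TRUE: both names refer to one environment
```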
#
On 7/4/2006 11:12 PM, Gabor Grothendieck wrote:
I'm not sure about whether this is desirable or a bug, but environments 
are special, in that they are among the very few objects treated as 
references.  In your example, adding a variable to e will also make it 
visible in f.

Duncan Murdoch
#
On Tue, 4 Jul 2006, Gabor Grothendieck wrote:

            
No, we can't. The two variables are references to the same environment, so 
they are the same.

If you want the attributes to be copies rather than references then create 
a list with the environment as an element and put the attributes on the 
list.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
On 7/5/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
I realize that that is how it works but what I was really wondering was
should it work that way?
#
On Wed, 5 Jul 2006, Gabor Grothendieck wrote:

            
I think it really should (and this question has come up before).  If you 
do
    e<-environment()
    f<-e

then there is only one object that f and e both point to. Now, since such 
things as S3 class and matrix dimension are implemented as attributes I 
think you really have to consider the attributes as part of the object 
[which is also how they are implemented, of course].  So if e and f are 
the same object they should have the same attributes.

Another reasonable position would be to disallow attributes on 
environments (as we do with NULL, another reference object), but that 
seems extreme.

 	-thomas
#
On 7/5/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
I don't think this follows since in the other cases modifying the
object also creates a copy.
I don't think that that would solve it because there is still the issue
of the class attribute which you can't disallow.

In fact consider this:

e <- new.env()
f <- e
class(f) <- c("myenv", "environment")
F <- function(x) UseMethod("F")
F.environment <- function(x) 1
F.myenv <- function(x) 2
F(e) # 2
F(f) # 2

The point is that subclassing does not work properly with environments
yet subclasses of the environment class should be possible since R
is supposed to be OO and I see no valid reason for excluding environments
for this.  I guess in this discussion I am coming to the realization that
this issue really is a problem with the current way R works.
#
On 7/5/2006 12:39 PM, Gabor Grothendieck wrote:
I don't think subclassing is important here.  The issue is that both e 
and f are references to the same object.  As such, changing the class of 
f will change the class of e, so what you're seeing is what you should 
expect.

You could fairly easily create f as a new environment, and copy all of 
the contents of e and its attributes over to f, and then you'd get the 
  behaviour you're expecting.  But that doesn't give sensible semantics 
for environments in general:  for example,

assign("x", 1, envir = e)

had better make the assignment into e, not into a copy of e.  Similarly,

 > f <- function() {
+   a <- 0
+   local1 <- function() a <<- 1
+   local2 <- function() a
+   list(f1=local1, f2=local2)
+ }
 >
 > fns <- f()
 > fns$f1()
 > fns$f2()
[1] 1

would not work properly if each of local1 and local2 just had copies of 
the evaluation environment of f.  They need references to the same 
environment, so the assignment in local1 will be seen in the environment 
inherited by local2.

Duncan Murdoch
#
On Wed, 5 Jul 2006, Gabor Grothendieck wrote:

            
In cases other than environments, NULL, external pointers and weak 
references a new object is (from the language definition point of view) 
created on assignment. The fact that sometimes the actual memory 
allocation is deferred is an implementation issue.

That is
   e <- 2
   f <- e

creates two different vectors of length 1, so of course they can have 
different attributes.

For environments (and for NULL, external pointers, and weak references), 
assignment does not create a new object. It creates another reference to 
the same object.  Since it is the same object, it is the same: attributes, 
values, class, etc.
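
The contrast between the two cases can be sketched as follows. The variable names v1, v2, env1 and env2 are hypothetical and chosen only to keep the two cases apart:

```r
# Ordinary vector: at the language level, assignment yields a distinct object
v1 <- 2
v2 <- v1
attr(v2, "X") <- "Y"
attributes(v1)           # NULL: v1 is unaffected

# Environment: assignment yields another reference to the same object
env1 <- new.env()
env2 <- env1
attr(env2, "X") <- "Y"
attr(env1, "X")          # "Y": the one shared object was modified
```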
Of course you can. It might be inconvenient, but it's not difficult.
No, subclassing *does* work. f and e are the same object, so they have the 
same class, c("myenv", "environment"). The thing that doesn't "work" the 
way you expect is assignment:
    f <- e
doesn't create a new object, as it would for any other sort of object.
It really is the way R is designed to work. Whether it is a problem or not 
is a separate issue. Environments really are references, not values, and 
they really work differently from the way most other objects work.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
On 7/5/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
What I meant is that if one wants R to be consistently oo
then you can't disallow them.

Objects are supposed to have classes and subclasses should be readily
definable.

I see no good reason for excluding environments from this.
Just because it does not create a new environment does not mean
that attr(f, "X") <- "Y"
could not create a new variable f that also points to e.
OK.  It's not a bug, but as we discuss this it seems to me that
its current operation is undesirable, since
environments don't fit into the scheme that other objects do,
yet a different design/implementation would allow this.
#
On 7/5/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
Yes, it is about how you *try* to look at it and not how it was designed.
 I was also bitten by this a few years ago.  There are two ways to
think of 'e' and 'f' above:

 1) 'e' and 'f' are R objects (reference variables) that refer to a
common object (the environment), and attr(e, name) <- value
applies to the R reference object 'e' while attr(f, name) <- value
applies to the 'f' object.

 2) 'e' and 'f' are references to a common object (the environment),
and attr(e, name) <- value applies to the common environment object,
as does attr(f, name) <- value.

Case 1) is wrong and Case 2) is correct.

If you prefer Case 1) you can wrap up your environment in a list, e.g.

  e <- list(env=new.env())
  f <- list(env=e$env)

This makes 'e' and 'f' two different objects that can have different
attributes, but both e$env and f$env refer to the same environment
object.  This is the basis of the Object class in the R.oo package.
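
A short sketch of how the list wrapper behaves, using the same e and f as above. The attribute "X" and the variable "a" are only illustrative:

```r
e <- list(env = new.env())
f <- list(env = e$env)

attr(e, "X") <- "Y"              # lands on the list e only
assign("a", 1, envir = e$env)    # lands in the shared environment

is.null(attr(f, "X"))            # TRUE: f has its own (empty) attributes
get("a", envir = f$env)          # 1: the environment contents are shared
identical(e$env, f$env)          # TRUE: one environment, two wrappers
```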

Cheers

Henrik
#
Gabor,
On Jul 5, 2006, at 1:16 PM, Gabor Grothendieck wrote:

            
We discuss it only because *you* think it's undesirable...
Environments are different *on purpose*, what environments do cannot  
be achieved using any other 'standard' object. And it's exactly  
environment's behavior on assign that makes it useful, so what you  
are proposing is basically making it into a list (so that it gets  
copied on assign), which makes no sense. What you really want is  
something other than an environment, but you insist on using an  
environment - it's like insisting on using a screwdriver on a nail -  
it's not the screwdriver's fault that it doesn't work ...

... and since you are pounding on OO - environments are the closest you  
can get to an object semantics as implemented in the most popular OO  
languages, so I wonder why you aren't arguing to make all objects  
into references ;).

Cheers,
Simon
#
On 7/5/06, Henrik Bengtsson <hb at stat.berkeley.edu> wrote:
I think what you mean is that case 2 is how it works but from
the viewpoint of desirability in terms of design, case 2 seems
inconsistent with oo principles since subclassing does not work
properly.
I realize that this is how a number of packages, such as tcltk, work
to circumvent the design problem but what one really wants is the
ability to define a subclass of environment directly -- not create a new
class and then define a subclass of that new class.  In particular, one wants
all the existing environment methods to work on the subclass
and they don't unless you redefine them all.  OO is about reusing
existing functionality via inheritance.
#
On 7/5/06, Simon Urbanek <simon.urbanek at r-project.org> wrote:
I don't think ad hominem arguments and unsupported statements that
things "make no sense" or analogies to screwdrivers have any relevance
to this discussion.  I think by this time I have shown that subclassing of
environments does not work yet it could if it were designed differently
and furthermore there are significant problems with the workarounds.
#
On Jul 5, 2006, at 2:23 PM, Gabor Grothendieck wrote:

            
I was hoping that it would help you understand the point; apparently I  
was wrong.
No, you didn't. Your example demonstrates that subclassing works  
correctly and exactly as expected.

Cheers,
Simon
#
On 7/5/2006 2:23 PM, Gabor Grothendieck wrote:
Gabor, I think Simon misread what you wrote above (taking "as we discuss 
this" to mean "because we discuss this", rather than "during our 
discussion of this"), and you misread his reply.  This doesn't look to 
me like an ad hominem.
You have ignored my explanation of why things are the way they are. 
Simon's statement is not unsupported in the context of the complete 
discussion.

I don't think you've shown that subclassing of environments doesn't 
work.  You have an example that shows that R implements 
Henrik's "Case 2" rather than his "Case 1", but as Thomas and I said, 
that really has nothing to do with subclassing.

Subclassing is about defining a new class, not about copying objects. 
You can (and did!) define a new class which inherits from the 
environment class.

There may be problems in UseMethod or NextMethod when the object is an 
environment; if there are, then you're right.  But so far you haven't 
shown that.

Duncan Murdoch
#
On 7/5/06, Simon Urbanek <simon.urbanek at r-project.org> wrote:
I have repeatedly seen people claiming in relation to various topics regarding
R that simply being consistent with the documentation makes it "correct".
Some of these discussions have, in the past, bordered on ridiculous since
everyone who encounters these problems knows there are problems with
the design.

The discussion here is that the current design has an undesirable property
and that an alternate design would remove that problem.
#
On 7/5/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
But by subclassing in the way allowed one comes up with something that
is not useful.

That is why tcltk and Henrik's package wrap environments in lists and define
a completely different class but by doing that they are not able to take
advantage of inheritance.
#
On 7/5/2006 3:47 PM, Gabor Grothendieck wrote:
You haven't shown that.  Show an example where you define a new class 
that should inherit from environment but doesn't.

All you've shown so far is that when you try to change the class of an 
object to a new class, it appears that the class of another object also 
changes.  (The explanation being that they are really just different 
names for the same object.)
I think they did that because they wanted explicit references to 
objects, rather than the built-in implicit ones.  I've wanted explicit 
references to things on a number of occasions too, but that's really 
unrelated to inheritance as far as I can see.

Duncan Murdoch
#
On 7/5/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
But that is not how oo works.  When one defines a child it's a
delta to the parent.  It does not change the parent.

Your parenthesized statement discussed why it works that way
under the current design but that is not inevitable.  The current
design is not the only possibility.
They are defining environments with special features yet they can't make
use of inheritance as they injected the environment object into their object
rather than subclassing it -- understandable given the current limitations.
#
On 7/5/2006 4:33 PM, Gabor Grothendieck wrote:
There are dozens of different definitions of "object oriented", but 
generally in the ones I know about, subclassing is something you do to a 
class, not to an object.  (In some languages classes are objects, but 
not all objects are classes.)

It is possible to have an object with class c('myenv', 'environment'). 
As far as I know, methods applied to that object will dispatch to the 
myenv method if one is defined, or to the environment method or default 
method if not.  That's exactly how things should work, and that's how 
they worked in the example you showed.
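
A small sketch of that dispatch order, which could serve as a starting point for checking UseMethod and NextMethod on environments. The generic describe and its methods are hypothetical names invented for this illustration:

```r
describe <- function(x) UseMethod("describe")
describe.default     <- function(x) "default"
describe.environment <- function(x) "environment"
describe.myenv       <- function(x) paste("myenv ->", NextMethod())

plain <- new.env()                       # no class attribute
sub   <- new.env()
class(sub) <- c("myenv", "environment")  # classed environment

describe(plain)   # "environment": dispatch on the implicit class
describe(sub)     # "myenv -> environment": myenv method, then NextMethod
```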

Because environments have unusual semantics, it wouldn't surprise me 
very much if there were some errors in the implementation of UseMethod 
or NextMethod.  If there are, then you'd have a valid complaint.  But so 
far you've just made an unsupported claim.
I think their worry was that attaching the special features to the 
environment would leave those features at risk of being thrown away by 
some other code that attached its own features to that environment.  But 
this has nothing to do with subclassing, it has everything to do with 
the semantics of references.

If you want to complain about the semantics of references in R, do that, 
but don't bring up the red herring of subclassing (unless you really 
have code that demonstrates that UseMethod or NextMethod don't work as 
expected.)

Duncan Murdoch
#
On 7/5/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
Perhaps the example I gave previously does not adequately
convey the problem.  Let's try another example.

#1 uses R as it currently exists.

1. Define an environment e1 with component A.  Now
suppose we wish to define a "subobject" for which
$ is overridden so that it will upper case the
right arg.  We can do this:

e1 <- new.env()
e1$A <- 1

# now define a "subobject" and override $
e2 <- structure(list(env = e1), class = c("myenvironment", "environment"))
"$.myenvironment" <- function(obj, x) obj[["env"]][[toupper(x)]]

e2$a # 1
e2[["A"]]  # NULL since e2 cannot usefully inherit [[

To really get this to work we would have to
redefine all of environment's methods.  I won't do that
here to keep things small.
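
For illustration only, a sketch of what forwarding one more accessor by hand might look like under the wrapper approach of #1. It assumes the same e1 and class vector as above; .subset2() is used inside the methods so that reaching the inner environment does not dispatch back into the methods being defined. Each further accessor would need the same treatment:

```r
e1 <- new.env()
e1$A <- 1

e2 <- structure(list(env = e1), class = c("myenvironment", "environment"))

# Forward [[ to the wrapped environment unchanged
"[[.myenvironment" <- function(obj, x) .subset2(obj, "env")[[x]]
# Forward $ as well, upper-casing the name first
"$.myenvironment"  <- function(obj, x) .subset2(obj, "env")[[toupper(x)]]

e2$a         # 1
e2[["A"]]    # 1, now that [[ is forwarded by hand
e1$A         # 1: e1 itself is untouched and carries no class attribute
```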


2. However, if it were possible to have the class
attached to e2 rather than being attached to the environment itself then
e2 could still inherit all the other methods of "environment" yet
be distinct from e1 even though the two would share the same
environment:

# this assumes attributes are associated with variables
# does not work as intended in R currently since the class(e2)<-
# statement also changes e1:

e1 <- new.env()
e1$A <- 1
e2 <- e1
class(e2) <- c("myenvironment", "environment")
"$.myenvironment" <- function(obj, x) obj[[toupper(x)]]
e2$a # 1
e2[["A"]]  # 1

Now all of environment's methods can be used and e1
can still be used in the usual way.

If we typed the above into R now then e1 would be changed
too.  Thus e2 is no longer a "subobject" of e1.  It is e1.
e1$a # 1  -- oops!

Although this example may seem simple I think it represents the essence
of  the situation that exists in a number of packages.  Any package
that currently uses the list(env=...an.environment...) idiom could
likely usefully
make use of this.  Note that in #1 we had to redefine all the methods of
environment to get access to them but if the functionality assumed in
#2 existed then it would inherit them and no further work need be done.
#
On 7/5/2006 8:06 PM, Gabor Grothendieck wrote:
I think this shows one of the problems of the S3 object system.  You've 
declared e2 to be an environment, but it's not.  There's nothing in S3 
to stop you from doing that, but the code won't work, as you observed.
That's something different from what I'd expect from an object.  I don't 
expect objects to inherit from other objects, I expect the class of 
objects to inherit from the class of other objects.  This means the 
objects share properties and behaviour, but not necessarily data.

In some languages they can share data too.  In S3 inheritance is almost 
meaningless, so there's really nothing shared, except a claim that any 
method you don't override will still somehow work:  but there's no 
possible way to verify that claim.  That's why John Chambers wrote the 
S4 system.  In S4 objects a descendant class will inherit properties, 
and it will inherit behaviour to some extent, but even S4 doesn't 
really encapsulate behaviour as much as some other object systems (e.g. 
Java) do.
But you can still do some of what you want as follows.

Instead of saying e2 <- e1, say e2 <- copy(e1), where copy is a function 
you write that copies the content and attributes of e1.  Changes to e2 
won't affect e1.  Then change the class of e2, and e1 is unaffected.
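
One way such a copy function might be written is sketched below. The name copy_env is hypothetical, the copy is shallow (values are copied binding by binding, and any promises or active bindings are evaluated in the process), and the parent environment is kept the same as the original's:

```r
# Hypothetical helper: shallow-copy an environment's bindings and attributes
copy_env <- function(env) {
  out <- new.env(parent = parent.env(env))
  for (nm in ls(env, all.names = TRUE)) {
    assign(nm, get(nm, envir = env, inherits = FALSE), envir = out)
  }
  attributes(out) <- attributes(env)
  out
}

e1 <- new.env()
e1$A <- 1
e2 <- copy_env(e1)                              # a distinct environment
class(e2) <- c("myenvironment", "environment")  # class lands on the copy only
e2$A <- 2                                       # content change stays in e2

e1$A              # still 1
attributes(e1)    # NULL
```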

If you'd like some e2 changes (i.e. content changes) to also affect e1 
but not others (i.e. attribute changes), then you're right, you can't do 
that in the current S3 system.  I'd say that's because R doesn't do 
references as well as it should.  I'd like to have a way to say a 
variable "r" is a reference to something else, so that if you do

s <- r

then s becomes a reference to the same thing.  I had a discussion last 
year with Luke Tierney about this, and he convinced me that it's a 
really good idea to distinguish references from the things they refer 
to, and it's also a good idea not to allow references to named objects.
(If you did an assignment from the thing r refers to directly to another 
variable, you'd get a copy.  Changes to the copy would have no effect on 
the thing r refers to.)

He put together an implementation of this idea using current R 
functions; the only problem was that since the parser didn't know 
anything about it, the syntax was a little clunky.

You couldn't use this to do exactly what you want unless all of R was 
rewritten so that environments were implemented as this new kind of 
reference.  Since environments play such a central role in low level 
processing in R, that's never going to happen.  But you could probably 
accomplish more of what you want to do than now.

Duncan Murdoch
#
On 7/5/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
With environments new possibilities become available and this sharing
of data is one of them.   From the viewpoint of S3 the inheritance is
still the usual S3 inheritance but it would allow the same environment
to be used in different objects sharing the same data.  It does not really
extend the inheritance concept with respect to data as the data in e1
cannot be overridden by e2 without changing e1.  It only allows the
methods to be overridden.  Yet it would provide a useful step without
being too drastic. The fact that existing code motivated this example
suggests that it would be useful.
If you are going to copy them you might as well use lists.
I think the idea here of separating the attributes from the environment
is less drastic than changing or proposing a new oo system or adding a
new reference facility.  It's more incremental in nature and simply makes
the existing sharing of environments more useful by leveraging the
existing S3 inheritance system in a way which cannot (usefully) be done now.
1 day later
#
Umm, maybe we should step back a bit here.

There are two points being made, both of which I think are reasonable, 
but they just don't happen to work together.

1.  Environments are special objects in R.  In fact they are the 
essential way directly in R to implement reference semantics.  Every 
object that holds a given environment (say, as a slot) holds the same object.  
Much of R depends on this semantic property.

2.  For any "class" in the language, it would be useful to define a 
class that extends (contains) that class, but has additional properties.

The essential reason 2 conflicts with 1 for environments is that slots 
and other properties of objects are implemented as attributes, including 
class(x).  Attributes are implemented inside the struct that holds the 
object itself.  This means you can not have an object of another class 
that contains an environment, in the same way you would with a 
non-reference object, without wiping out properties of the "contained" 
object.

You do get told this, in the  simplest case anyway.  For example, 
suppose I want a class that acts like an environment but also has a slot 
"source" for some extra information:

 > setClass("e2", representation(source="character"), contains = 
"environment")
[1] "e2"
Warning message:
class "environment" cannot be used as the data part of another class in: 
reconcilePropertiesAndPrototype(name, slots, prototype, superClasses, 

and indeed the new class doesn't work as you would expect.

So the question is whether we want to enforce that limitation, in which 
case we probably need a stronger slap on the wrist, or whether we should 
consider a different implementation for this case, to allow the new 
class to inherit the properties of an environment, without violating the 
integrity of environments.  I kind of like the second approach, but only 
if it does not overly mess with the general approach to classes.
Gabor Grothendieck wrote:

            
<etc.>