Attributes of top level environments clobbered (was Re: [R] possible bug in function 'var' in R 2.7.2?)

6 messages · Gabor Grothendieck, Luke Tierney

#
On Fri, Oct 3, 2008 at 3:23 AM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
The following year-old post suggested that the problem of clobbering
attributes of top level environment objects would be fixed for 2.7, but
it is still present in "R version 2.7.2 (2008-08-25)" and also in
"R version 2.8.0 alpha (2008-10-01 r46589)":

   https://stat.ethz.ch/pipermail/r-devel/2007-October/047184.html

The Avoiding R Bugs section of this page:

   http://r-proto.googlecode.com

has more discussion as well as a list of some other R bugs.

This can be tested by creating a package with these two files only:

---DESCRIPTION---
Package: testlazy
Version: 1.0-0
Date: 2008-10-03
Title: Test lazy loading
Author: G Grothendieck
Maintainer: G Grothendieck <ggrothendieck at gmail.com>
Description: Test lazy loading with top level objects.
Depends: proto
LazyLoad: yes
License: GPL-2
---R/testlazy.R---
TopLevel <- proto()
---

And then testing it:

library(testlazy)
class(TopLevel)

If the result is just "environment", then the class attribute was stripped.
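
The same check can be scripted. The sketch below is only an illustration: class_was_stripped is a hypothetical helper name, and the "tagged" object merely mimics a classed environment in base R rather than using proto or the testlazy package.

```r
# Hypothetical helper: TRUE when an environment carries no class attribute,
# so that class() falls back to the implicit "environment".
class_was_stripped <- function(x) {
  is.environment(x) && is.null(attr(x, "class"))
}

plain <- new.env()                            # never had a class attribute
tagged <- new.env()
class(tagged) <- c("proto", "environment")    # assumption: mimics a classed environment

class_was_stripped(plain)    # TRUE
class_was_stripped(tagged)   # FALSE
```

In the package test above, class_was_stripped(TopLevel) returning TRUE would correspond to the stripped case.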
#
I will look into fixing it sometime if no one else feels like doing
it.  The environment aspect is not high priority; some other related
issues are more so (locking and active bindings as I recall).  But
even those may not make it to the top of my queue any time soon.

The issue of placing attributes on environments has come up before,
many times.  It is routinely advised against.  For better or worse,
the way environments were exposed to the R level is not designed to
support this properly.  Fixing this so that attributes are supported
reliably is non-trivial, and it is hard to justify the effort given
that there are standard work-arounds (such as putting the environment
in a list and attaching attributes to the list).  An alternative way
of removing the issues associated with attributes on environments, in
line with the way NULL works, is to disable placing attributes on
environments, at least from the R level.  This option is looking
increasingly attractive.
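
A minimal sketch of the list-wrapper work-around mentioned above, in base R. All names (new_counter, increment, the "counter" class) are hypothetical and chosen for illustration only; this is not code from proto or from R itself.

```r
# Hypothetical constructor: mutable state lives in an environment,
# while the class attribute sits on the list that wraps it.
new_counter <- function() {
  state <- new.env()
  state$count <- 0
  structure(list(state = state), class = "counter")
}

# Illustrative S3 generic and method.
increment <- function(x) UseMethod("increment")
increment.counter <- function(x) {
  state <- x$state
  state$count <- state$count + 1   # modifies the shared environment in place
  invisible(x)
}

cnt <- new_counter()
increment(cnt)
increment(cnt)
cnt$state$count        # 2

bare <- unclass(cnt)   # returns an unclassed copy of the list
class(cnt)             # still "counter"
```

Because the list is an ordinary value, removing the class from a copy leaves the original object's class alone, while both copies still refer to the same underlying environment.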

luke
On Fri, 3 Oct 2008, Gabor Grothendieck wrote:
#
On Fri, Oct 3, 2008 at 11:43 AM, Luke Tierney <luke at stat.uiowa.edu> wrote:
These are not good options:

- the workaround does not allow one to inherit methods.  This
implies tediously rewriting, or writing wrappers for, every inherited
method in any such subclass.  It's really tantamount to eliminating OO
for environments, which is not a reasonable solution for a language
that is supposed to be OO.

- eliminating attributes on environments is even worse since widely
used packages such as proto and ggplot2 would suffer.
Maintaining reasonable compatibility should be a goal of the core
development.  It would be better to document the current situation than
to make such a retrograde change.

- if time is a problem perhaps the core group needs to add resources
to reasonably address the problems in R.  Traditional economics
do not apply to an open source project.  There is no monetary cost to
adding additional developers.
#
On Fri, 3 Oct 2008, Gabor Grothendieck wrote:

            
It would mean sealing environments, making them final, pick your
favorite OO terminology.  Standard thing to do in many OO languages
when it is warranted.

The authors of proto would finally need to bite the bullet and address
this issue.  This would make proto more reliable in the end.

luke
#
On Fri, Oct 3, 2008 at 12:46 PM, Luke Tierney <luke at stat.uiowa.edu> wrote:
I don't think this really addresses the problem.  The S3 model is
in principle capable of handling environments and almost does
so fully now.
The fact that environment attributes get clobbered at top level
when defined in packages with lazy loading (but not outside packages
nor in packages without lazy loading) is clearly a deficiency of R,
not proto.

Furthermore, the suggestion that this deficiency in R somehow reflects
any unreliability in proto is likewise not accurate.  proto is extremely
reliable, particularly in comparison to R.  In fact there are no known
bugs in the development version of proto, and large widely used packages
use proto.  (If you are aware of any bugs let me know.)  The fact that it
is so solid is quite understandable because, even if some of its code is
necessarily complex, the code is so short that it is readily possible to
accomplish this apparent bug-free status with reasonable effort.
Furthermore, the proto home page documents the problems -- mostly
problems with R itself, not proto.

I do appreciate the excellent R software; however, there are a few points like
those addressed on the proto home page which do need to be addressed in R
for it to be fully functional.
#
On Fri, 3 Oct 2008, Gabor Grothendieck wrote:

            
Depends on your definitions.  Dispatch may work, but for me the show
stopper is unclass and related things.  Use of unclass inside code is
a fairly common idiom, and whenever an environment with a class gets
into a function that does an unclass on it, the environment ceases to
have a class.  You can argue that this should not be so but it is so,
it has been so for a long time, and it is likely to remain so for a
long time.  To me this means that putting classes on environments is
too brittle to use in production code.  For this reason I would not use
a package that relies on doing this and would not encourage anyone
else to at this point.  I like prototype based OO, especially for
graphics, which is why I wrote such a system for Lisp-Stat many years
ago. But using a variant of such a system in which the class can so
easily disappear seems ill-advised to me.
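
The brittleness described here can be sketched in base R. The example is an illustration under stated assumptions: strip_class is a hypothetical stand-in for any function that removes the class attribute from its argument, and it uses `attr<-` rather than unclass, since how unclass treats environments may differ between R versions.

```r
# Hypothetical function that drops the class attribute from its argument
# for internal use and does not return the modified object.
strip_class <- function(x) {
  attr(x, "class") <- NULL
  invisible(NULL)
}

# A list is passed by value: the caller's object keeps its class.
l <- structure(list(), class = "widget")
strip_class(l)
inherits(l, "widget")   # TRUE

# An environment is a reference: the attribute is removed from the
# one shared object, so the caller's class is gone too.
e <- new.env()
class(e) <- c("widget", "environment")
strip_class(e)
inherits(e, "widget")   # FALSE
```

The two calls are identical; only the copy semantics of the argument differ, which is why a class on an environment can vanish as a side effect of code elsewhere.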

While these packages may be widely used, their authors have disregarded
repeated advice on this.  Inheritance just does not work reliably for
environments. Encapsulation does work. (One could argue that
encapsulation would be a better choice even if reliable inheritance
was available, but that is a moot point).

There is a bug in lazy loading in that some important features of
environments are not preserved.  Given that attributes on environments
are not reliable, however, I would argue that the fact that they are not
preserved is not particularly important.  While fixing the more
important bugs it is probably not hard to fix the attribute issue as
well (that is a conditional statement -- fixing the more important bugs
is going to be fairly painful, which is why it hasn't happened yet.)
Once that is done, lazy loading may "work" for proto but using
attributes on environments is still a Bad Idea given the way R works.

There are some interesting points on that page that are worth looking
into.  Over time I suspect all but the current 3. will be addressed,
but 3., which is a variant on the unclass issue, is not likely to be.

You can call this a deficiency in R if you like, and I would agree in
the sense that I think it is inappropriate to allow attributes to be
set but not in a reliable way because they can be inadvertently
removed.  We should have done this differently.  There were/are two
choices:

     Make reference values, including environments, special in that they
     may not have attributes. This would have been fairly easy (modulo one
     use made in decorating the frames on the search path) and could be
     done now to clean things up.

     Make R-visible environments in two parts--a wrapper that is passed by
     value like standard R objects and could have attributes, and an
     internal part that is essentially the current environment object.
     This is analogous to the way that character vectors, even of length 1,
     consist of an STRSXP wrapper containing CHARSXPs that hold the string.
     The STRSXP's are visible at the R level, the CHARSXPs are not.  This
     would have been messier to implement, and unfortunately would be very
     messy to retro-fit at this point, so it isn't likely to happen unless
     there is some other compelling reason to do so.

The bottom line is that this situation isn't likely to change any time
soon as far as I can see.  If that means that for you R will not be
"fully functional" then so be it.  Attributes on environments are not
reliable and hence it is a Bad Idea to try to use them.  This is a
feature of R as it is now, has been for a while, and will be for a
while. If you write code for language X, you can write it for
X-as-it-is or X-as-you-wish-it-to-be; but if you choose
X-as-you-wish-it-to-be and find things don't work out, it's hard to
argue that the fault is with X.

luke