
proto and baseenv()

26 messages · Thomas Petzoldt, Ben, Hadley Wickham +3 more


#
I understand why the following happens ($.proto delegates to get(),
which ascends the parent hierarchy up to globalenv()), but I still
find it counterintuitive:

  > z <- 1
  > y <- proto(a=2)
  > y$z
  [1] 1

Although this is well-documented behavior, wouldn't it uphold the
principle of least surprise to inherit instead from baseenv() or
emptyenv()? (See attached patch.)

Spurious parent definitions have already been the source of bizarre
bugs, prompting me to use proto like this:

  > z <- 1
  > y <- proto(baseenv(), a=2)
  > y$z
  Error in get(x, env = this, inherits = inh) : object 'z' not found

It's cumbersome, but it ensures that parent definitions don't pollute
my object space; I would rather that be the default behaviour, though.
Ben
#
Wow, thanks for the heads-up.  That is horrible behavior.  But using
baseenv() doesn't seem like the solution either.  I'm new to proto,
but it seems like this is also a big drawback:

  > z <- 1
  > proto(baseenv(), expr = {a <- z})$a
  Error in eval(expr, envir, enclos) : object "z" not found
#
That is how R works with free variables, e.g.

z <- 1
f <- function() z
f() # 1

so the current behavior seems consistent with the rest of R.
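The same delegation can be sketched with plain base R environments (a sketch, base R only; proto builds its objects on environments, though this is not proto's actual code, and proto's default parent is roughly the environment where proto() is called):

```r
z <- 1
p <- new.env(parent = environment())  # the calling environment, like proto(...)
p$a <- 2
get("a", envir = p)  # 2: found in p itself
get("z", envir = p)  # 1: not in p, so lookup ascends to the parent
```

This is exactly the free-variable lookup a closure performs, which is the consistency being argued for here.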

Note that the example below could be done like this to avoid the error:

  > z <- 1
  > proto(baseenv(), a = z)$a
  [1] 1
On Thu, Feb 25, 2010 at 12:33 AM, Ben <misc7 at emerose.org> wrote:
#
On 25.02.2010 06:33, Ben wrote:
I would say that this behaviour is intentional and not "horrible". proto 
objects simply do the same as ordinary functions in R, which likewise 
have full access to variables and functions at a higher level:

Try the following:

 > y <- proto(a=2)
 > y$ls()
[1] "a"


ls() is defined in package base, so it would work even if you inherited 
from baseenv(). So why is it surprising that proto objects (by default) 
inherit objects from other packages and from the user workspace?


Thomas P.
Ben
#
I was disappointed in this behavior because it seems error-prone.
Suppose I declare an environment:

  > b <- 1
  > p <- proto(expr = {
  +   a <- 2
  +   ...
  + })
  > p$a
  [1] 2
  > p$b
  [1] 1

Presumably if I ask for p$a or p$b later, it's because I'm interested
in the value of "p$a" or "p$b" that I specifically put inside that
environment.  Otherwise I would just ask for "a" or "b".  If I'm
asking for "p$b" in the above case, that means I forgot to declare b
inside p.  In this case there should be an error telling me that, not
a silent substitution of the wrong quantity.
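The stricter lookup described here already exists in base R as get()'s inherits argument; a sketch of the behaviour being asked for as the default (base R only, no proto):

```r
z <- 1
p <- new.env()
p$a <- 2
get("a", envir = p, inherits = FALSE)  # 2: defined in p itself
# With inherits = FALSE there is no silent fallback to enclosing scopes:
res <- tryCatch(get("z", envir = p, inherits = FALSE),
                error = function(e) conditionMessage(e))
res  # "object 'z' not found"
```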

If someone wanted to do the y$ls() thing, they could always

  > y <- proto(baseenv(), a = 2)
  > y$ls()
  [1] "a"

Another reason is that there are plenty of other programming languages
that have similar structures and this behavior is very odd.  In
standard languages asking for "b" inside the "p" object gives you an
error, and no one complains.  Even in R, we have this behavior:

  > p <- list(a = 2)
  > p$b
  NULL

(Actually I think the above should be an error, but at least it isn't
1.)  So anyway, I'm not saying that p$b being 1 is an outright 2+2=5
bug, but it does seem to be surprising behavior that leads to bugs.
But I'm sure you're right that there are historical/structural reasons
for this to be the case, so perhaps there's no solution.
#
On 25/02/2010 8:50 AM, Ben wrote:
I think you are looking for a different object model than proto offers.  
There aren't many languages that offer the prototype object model.
Which languages are you thinking of?  The paper about proto only 
mentions Self, LispStat, and Javascript. The Wikipedia page 
http://en.wikipedia.org/wiki/Prototype-based_programming has a long list 
of obscure ones, but the only additional ones I've used are R, Perl and 
Tcl, all with add-on packages.

Duncan Murdoch
#
You might want to have a look at the mutatr package which implements
prototype OO with the behaviour you're looking for (I think).  I also
have a paper available if you email me off list.

Hadley
#
On Thu, Feb 25, 2010 at 9:57 AM, hadley wickham <h.wickham at gmail.com> wrote:
I think his objection is really only that he must specify baseenv() to
get the behavior he prefers, but from what I understand (I am basing
this on my understanding that mutatr works like the Io language), in
mutatr one must use Object, which seems really no different.
#
Quoth hadley wickham on Sweetmorn, the 56th of Chaos:
Hey, Hadley; I remember your prototype stuff from DSC 2009. I saw some
slides of yours that discussed proto, and was wondering whether mutatr
was still maintained.

Would you mind sending me that paper off-list, if necessary?
#
Quoth Gabor Grothendieck on Sweetmorn, the 56th of Chaos:
Which is exactly what I started doing in proto: declaring a base
prototype (`object') that inherits from baseenv(); whence my
subprototypes spring.
#
Yes, it's maintained, and I've even written a package that uses it
(testthat - both it and mutatr are available on cran).  It's still
mainly an experiment, and I don't currently have any plans to give up
using proto.  However, my entirely personal perspective is that my
code seems to be cleaner when I use mutatr instead of proto (this
probably reflects that I wrote the package though!).

Hadley
Ben
#
Yes, you're probably right; I don't have much experience using the
prototype model.  This is the way I expected it to work:
[1] 1
Error in get(x, env = this, inherits = inh) : variable "z" was not found

Also it seems it would lead to fewer bugs if it worked that way.
(Peter Danenberg mentions he's run into bugs because of this, and I
can see why.)  But as I mentioned I'm new to prototype programming.
If it worked like in my snippet, would this lead to less effective
prototype programming?


Thanks,
#
On Thu, Feb 25, 2010 at 7:49 PM, Ben <misc7 at emerose.org> wrote:
+   self$a <- z
+ })
[1] 1
Error: Field z not found in Object <0x1022b66d8>

Hadley
#
Which is exactly how it should work! Namespace pollution is orthogonal
to the specific object model, and Duncan's assertion about the
prevalence of prototypes is a red herring.

Here's how it works in Scheme, for instance, using Neil van Dyke's
Prototype-Delegation package:[1]

  #;1> (use protobj)
  #;2> (define z 1)
  #;3> (define a (%))
  #;4> (! a a z)
  #;5> (? a a)
  1
  #;6> (? a z)

  Error: Object has no such slot:
  #<object>
  z

Just like one would expect! And since protobj is based on Self,
Ungar and Smith's original prototype system,[2] I suspect that Self
behaves similarly.

Footnotes: 
[1]  http://www.neilvandyke.org/protobj/

[2]  http://research.sun.com/self/papers/self-power.html
#
On Thu, Feb 25, 2010 at 9:23 PM, Peter Danenberg <pcd at roxygen.org> wrote:
One would not expect the behavior you cite if you were working in R,
only if you were more familiar with something else.  One can just as
easily argue that consistency and flexibility are more important
principles.  Your preference is inconsistent with how the rest of R
works and is inflexible, since everything inherits from Object.  In
contrast, proto uses R-like behavior as its default but is flexible
enough that your preference can be accommodated, and compactly at that.

Also I think your argument is based partly on repeating the original
erroneous (relative to the writer's intention) proto code without
repeating my correction, confusing the discussion with simple user
error.
#
Really? Here are a couple of counterexamples in S3 and S4 objects:

  > z <- 1
  >
  > ## S4
  > setClass('A', representation(a='numeric'))
  [1] "A"
  > a <- new('A', a=z)
  > a@z
  Error: no slot of name "z" for this object of class "A"
  >
  > ## S3
  > a <- structure(list(a=z), class='A')
  > a$z
  NULL

As far as flexibility is concerned: keep the ability of people to
inherit from the parent environment if they don't mind
namespace-pollution and unpredictability.

I'm merely asking that the default behavior resemble the twenty-three
years of precedent since Ungar's original Self paper.
I acknowledged your correction in an earlier email when I stated that,
"[one has] to choose between eval and parent pollution."
#
On Thu, Feb 25, 2010 at 11:16 PM, Peter Danenberg <pcd at roxygen.org> wrote:
Lists don't have inheritance at all, so the last example seems
irrelevant.  Also, other object systems that are alternatives to proto
seem less relevant than basic scoping and free-variable lookup in
functions.  Those are the relevant principles.
proto does that but uses the consistent default rather than the
inconsistent default that you prefer.

The namespace advantage is real, but it's less important than
consistency with R and the flexibility to have it either way.
That is just what I (and possibly Duncan) have argued: that your
expectation is based on different systems.
But the last email repeated the wrong code anyway and used it as
the springboard for the discussion.
#
Sorry, but that seems absurd; object systems are less relevant to each
other than the totally orthogonal question of scope?
$.proto falls back upon get(), I presume, because the authors didn't
feel like recursing up the parent hierarchy themselves; I'll continue
to believe that the scope pollution was an oversight until they
contradict me. At which point I'll probably switch object systems.

Vague appeals to consistency, when you're really only talking about
naive get() semantics, don't mean much, especially after you've spent
hours debugging mysterious bugs resulting from scope pollution.
#
On Fri, Feb 26, 2010 at 12:41 AM, Peter Danenberg <pcd at roxygen.org> wrote:
Yes, if you are using one then obviously you have decided to use it in
place of the other.  Also your example of S3 was misleading since it
used lists which do not have inheritance and did not truly illustrate
S3.  Free variables in S3 methods follow the same lookup procedure as
ordinary functions and using S3 together with proto works well.  In
fact, proto uses two S3 classes.
Here I think you are admitting that the basic facilities of R do work
in the way proto does.

Also, your alternative would likely be unusable due to performance,
whereas proto is fast enough to be usable (see the list of applications
that use it at http://r-proto.googlecode.com/#Applications).  It's not
as fast as S3 (though sometimes you can get it that fast by optimizing
your code).  The development version of proto is even faster than the
current version due to the addition of lazy evaluation.
In the end it seems that your real beef is with R, so perhaps you
should be using a different language.

With respect to proto its really just discussing whether to use

proto(baseenv(), ...) vs proto(...)

since the former gives you everything you want and the distinction
seems pretty trivial given how easy it is to use one or the other.  If
you used iolanguage or similar you would have to specify Object so
there is not even a penalty in terms of compactness.
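The two defaults can be modelled with plain base R environments (a sketch, base R only; proto() constructs its objects from environments in essentially this way):

```r
z <- 1
permissive <- new.env(parent = environment())  # like proto(...): sees the caller's scope
strict     <- new.env(parent = baseenv())      # like proto(baseenv(), ...)
exists("z", envir = permissive)  # TRUE:  lookup reaches the caller's environment
exists("z", envir = strict)      # FALSE: lookup stops after base
```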

There have also been threads on how to "fix" scoping in R for
individual functions, and various manipulations of environments have
been suggested, but in the end no one does this in practice.  In proto,
at least, you can do the part you want, and it's trivial to do so.
#
On 26/02/2010 7:09 AM, Gabor Grothendieck wrote:
I would say the default behaviour makes more sense.  There are very few 
circumstances in R where scoping makes locals and base variables 
visible, *but nothing else*.  I think users would be surprised that

a$ls()

works, but

a$str()

fails (because str() is in utils, ls() is in base).  Effectively what 
the current default says is that objects inherit from the current 
environment.  That's how environments work in R.
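The ls()/str() asymmetry is easy to check with a plain environment whose parent is baseenv() (base R only; the parent of baseenv() is the empty environment, so lookup stops there):

```r
e <- new.env(parent = baseenv())
exists("ls", envir = e)   # TRUE:  ls() lives in package base
exists("str", envir = e)  # FALSE: str() lives in utils, which is not on base's chain
```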

One thing that I dislike about scoping in R is the fact that even in a 
namespace, searches eventually get to the global environment.  I'd 
prefer if namespace searches went through the imported namespaces and 
nothing else.  If that were the case, then the a$z example would never 
find a z in the global environment, and that example would only be a 
problem for people fiddling around in the console, not programming 
carefully in a package.  But again, this is a criticism of R, not of 
proto.  (I believe the reason for the current behaviour is to support 
finding S3 methods:  a user should be able to define a new class and an 
S3 method for it, and R should find it even if the writer of the 
original generic knew nothing about the new class or method.  This 
outside-the-namespace search is needed for that, but I'd prefer it if it 
were more limited.)

Duncan Murdoch
Ben
#
In my case you may be right.  I do think there are a million things
wrong with R.  For instance, I was looking for a package that
overcomes two of the problems R IMHO has: namespace pollution and the
lack of an easy-to-use standard object system.

Should I be using R?  I do keep asking myself that same question...
Unfortunately this doesn't fix the problem as was noted earlier:
Error in eval(expr, envir, enclos) : object "z" not found
This makes sense to me.
#
On Fri, Feb 26, 2010 at 9:01 AM, Ben <misc7 at emerose.org> wrote:
As already mentioned, let's not confuse user error with actual problems
pertaining to proto and R.  It should have been written like this if
that is what was wanted:

  > z <- 1
  > proto(baseenv(), a = z)$a
  [1] 1
Ben
#
Maybe I'm still not getting something fundamental, but I didn't intend
my "proto(baseenv(), expr={a <- z})$a" example to be realistic.  In
practice "a <- z" would be replaced with hundreds of lines of code,
where many functions are called.  In theory you could track down every
function that's global or comes from another package and pass each one
in explicitly, but then you would have to add dozens of extra lines of
boilerplate.
It's actually worse than that, as this example shows:
Error in get("f", env = proto(baseenv(), f = function(.) sd(1:3)), inherits = TRUE)(proto(baseenv(),  : 
  could not find function "sd"
Error in sd(1:3) : could not find function "var"

Not only would every external function have to be specifically
declared with a separate argument, even unused functions may need to
be declared.  That means any change in the implementation of an
external function could break this code.

Again, I may be missing something since I'm new to proto, but I don't
see why you're dismissing this example as "user error".
#
On Fri, Feb 26, 2010 at 8:46 PM, Ben <misc7 at emerose.org> wrote:
I think you are missing the R search path.  Try this:

search()

That shows you the search path.  Normally it starts searching at the
beginning and moves forward.
That's because sd is in stats, not in base and you told it to start
searching at the end of the search path rather than the beginning.
Try this:
[1] 1
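The sd example can be checked directly in base R (no proto required; this assumes the stats package is attached, as it is in a default R session):

```r
search()         # lookup starts at ".GlobalEnv" and moves right, ending at "package:base"
environment(sd)  # sd is defined in the stats namespace, not in base
exists("sd", envir = baseenv())                         # FALSE: base cannot see stats
exists("sd", envir = as.environment("package:stats"))   # TRUE
```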
#
Added one other comment below.

On Fri, Feb 26, 2010 at 9:41 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
or if you just want to exclude the global environment but still include
any loaded packages:

Object <- as.environment(2)

(If you have not loaded any packages then the two Object <- statements
have the same effect, but if there are loaded packages the second
includes them while the first excludes them.)
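A small check of what as.environment(2) denotes (base R only; the name Object follows the email above):

```r
Object <- as.environment(2)  # second entry on the search path, just below .GlobalEnv
identical(Object, as.environment(search()[2]))  # TRUE
environmentName(Object)  # e.g. "package:stats" if stats is the topmost attached package
```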