surprising behaviour of names<-
Berwin A Turlach wrote:
On Thu, 12 Mar 2009 10:53:19 +0100 Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
well, ?'names<-' says:
"
Value:
For 'names<-', the updated object.
"
which is only partially correct, in that the value will sometimes be
an updated *copy* of the object.
But since R supposedly
*supposedly*
uses call-by-value (though we know how to circumvent that, don't we?)
we know how a lot of built-ins hack around this, don't we, and we also know that call-by-value is not really the argument passing mechanism in r.
wouldn't you always expect that a copy of the object is returned?
indeed! that's what i have said previously, no? there is still space for the smart (i mean it) copy-on-assignment behaviour, but it should not be visible to the user, in particular, not in that 'names<-' destructively modifies the object it is given when the refcount is 1. in my humble opinion, there is either a design flaw or a bug here.
And the R Language manual (ignoring for the moment that it is a
draft and all that),
since we must...
clearly states that
names(x) <- c("a","b")
is equivalent to
'*tmp*' <- x
x <- "names<-"('*tmp*', value=c("a","b"))
... and?
This seems to suggest
seems to suggest? is not the purpose of documentation to clearly, ideally beyond any doubt, specify what is to be specified?
that in this case the infix and prefix syntaxes are not equivalent, as it does not say that
are you suggesting fortune telling from what the docs do *not* say?
names(x) <- c("a","b")
is equivalent to
x <- "names<-"(x, value=c("a","b"))
and I was commenting on the claim that the infix syntax is equivalent
to the prefix syntax.
does this say anything about what 'names<-'(...) actually
returns? updated *tmp*, or a copy of it?
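A minimal sketch of the expansion quoted above, for readers who want to try it. It only checks that the replacement form and the manual's expansion leave the same visible value; it cannot show whether a copy was made along the way. The variable names x and z are illustrative and not from the thread.

```r
# replacement (infix) form
x <- c(1, 2)
names(x) <- c("a", "b")

# the expansion as quoted from the R Language Definition (draft),
# written out by hand on a second vector
z <- c(1, 2)
`*tmp*` <- z
z <- `names<-`(`*tmp*`, value = c("a", "b"))
rm(`*tmp*`)

# compares values and attributes, not memory locations
identical(x, z)
```

Both vectors end up with the same names, so the visible result agrees with the documented expansion; the open question in this thread is only about object identity.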
Since R uses pass-by-value,
since? it doesn't!
you would expect the latter, wouldn't you?
yes, that's what i'd expect in a functional language.
If you entertain the idea that 'names<-' updates *tmp* and returns the updated *tmp*, then you believe that 'names<-' behaves in a non-standard way and should take appropriate care.
i got lost in your argumentation. i have given examples of where 'names<-' destructively modifies and returns the updated object, not a copy. what is your point here?
And the fact that a variable *tmp* is used hints at the fact that 'names<-' might have side effects.
are you suggesting fortune telling from the fact that a variable *tmp* is used?
If 'names<-' has side effects,
then it might not be well defined what value x ends up with if
one executes:
x <- 'names<-'(x, value=c("a","b"))
not really, unless you mean the returned object in the referential sense
(memory location) versus value conceptually. here x will obviously have
the value of the original x plus the names, *but* indeed you cannot tell
from this snippet whether after the assignment x will be the same,
though updated, object or will rather be an updated copy:
x = c(1)
x = 'names<-'(x, 'foo')
# x is the same object
x = c(1)
y = x
x = 'names<-'(x, 'foo')
# x is another object
so, as you say, it is not well defined which object x will end up with as
its value, though the value of that object visible to the user is well
defined. rewrite the above and play:
x = c(1)
y = 'names<-'(x, 'foo')
names(x)
what are the names of x? is y identical (sensu reference) with x, is y
different (sensu reference) but indiscernible (sensu value) from x, or
is y different (sensu value) from x in that y has names and x doesn't?
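A hedged sketch for probing these three possibilities. The outcome for x depends on the R version and on how many references to the vector exist, so no expected output is given for it; only the names of y are certain.

```r
x <- c(1)
y <- `names<-`(x, "foo")

names(y)   # y carries the new name in every case
names(x)   # NULL if x was left alone, "foo" if it was modified in place

# identical() compares values and attributes, not memory addresses,
# so TRUE here means "indiscernible by value", not "same object"
identical(x, y)
```

If names(x) is NULL, y differs from x by value. If it is "foo", x and y are at least indiscernible by value, and on builds of R with memory profiling enabled, tracemem(x) called before the prefix call can help tell whether a copy was made.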
This is similar to the discussion what value i should have in the following C snippet: i = 0; i += i++;
nonsense, it's a *completely* different issue. here you touch the issue
of the order of evaluation, and not of whether an object is copied or
modified; above, the inverse is true.
in fact, your example is useless because the result here is clearly
specified by the semantics (as far as i know -- prove me wrong). you
look up i (0) and i (0) (the order does not matter here), add these
values (0), assign to i (0), and increase i (1).
i have a better example for you:
int i = 0;
i += ++i - ++i;
which will give different final values for i in c (2 with gcc 4.2, 1
with gcc 3.4), c# and java (-1), perl (2) and php (1). again, this has
nothing to do with the above.
[..]
I am not sure whether R ever behaved in that way, but as Peter
pointed out, this would be quite undesirable from a memory
management and performance point of view.
why? you can still use the infix names<- with destructive semantics
to avoid copying.
I guess that would require a rewrite (or extension) of the parser. To me, Section 10.1.2 of the Language Definition manual suggests that once an expression is parsed, you cannot distinguish any more whether 'names<-' was called using infix syntax or prefix syntax.
but this must be nonsense, since:
x = 1
'names<-'(x, 'foo')
names(x)
# NULL
x = 1
names(x) <- 'foo'
names(x)
# "foo"
clearly, there is not only syntactic difference here. but it might be
that 10.1.2 does not suggest anything like what you say.
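One way to look at what the parser actually produces for the two forms is to inspect the unevaluated calls with quote(). This is a sketch, not a statement about what section 10.1.2 intends; the names infix and prefix are illustrative.

```r
infix  <- quote(names(x) <- "foo")
prefix <- quote(`names<-`(x, "foo"))

# the function at the head of each parsed call
as.character(infix[[1]])
as.character(prefix[[1]])

# in the infix form the assignment target is itself an unevaluated call
class(infix[[2]])
deparse(infix[[2]])
```

The infix form parses as a call to `<-` whose target is names(x), while the prefix form parses as a direct call to `names<-`. The two therefore remain distinguishable after parsing, and the rewrite of the infix form into a `names<-` call happens when the assignment is evaluated.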
Thus, I guess you want to start a discussion with R Core whether it is worthwhile to change the parser such that it keeps track of whether a function was used with infix notation or prefix notation and to provide for most (all?) assignment operators implementations that use destructive semantics if the infix version was used and always copy if the prefix notation is used.
as i explained a few months ago, i study r to find examples of bad design. if anyone in the r core is interested in having the problems i report fixed, i'm happy to get involved in a discussion about the design and implementation. if not, i'm happy with just pointing out the issues. cheers, vQ