Confused about NAMED
On 11-11-24 6:34 AM, Matthew Dowle wrote:
On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
Hi, I expected NAMED to be 1 in all these three cases. It is for one of them, but not the other two?
R --vanilla
R version 2.14.0 (2011-10-31) Platform: i386-pc-mingw32/i386 (32-bit)
x = 1L
.Internal(inspect(x)) # why NAM(2)? expected NAM(1)
@2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
y = 1:10
.Internal(inspect(y)) # NAM(1) as expected but why different to x?
@272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
z = data.frame()
.Internal(inspect(z)) # why NAM(2)? expected NAM(1)
@24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
ATTRIB:
@24fc270 02 LISTSXP g0c0 []
TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
@24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
@25be500 16 STRSXP g0c1 [] (len=1, tl=0)
@1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
It's a little difficult to search for the word "named", but I tried and
found this in R-ints:
"Note that optimizing NAMED = 1 is only effective within a primitive
(as the closure wrapper of a .Internal will set NAMED = 2 when the
promise to the argument is evaluated)"
So might it be that just looking at NAMED using .Internal(inspect()) is
setting NAMED=2? But if so, why does y have NAMED==1?
This is tricky business... I'm not quite sure I'll get it right, but let's
try.
When you are assigning a constant, the value you assign is already part of
the assignment expression, so if you want to modify it, you must
duplicate. So NAMED==2 on z<- 1 is basically to prevent you from
accidentally "changing the value of 1". If it weren't, then you could get
bitten by code like for(i in 1:2) {z<- 1; if(i==1) z[1]<- 2}.
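A small sketch of that hazard, not part of the original message. The wrapper function run_loop and the vector seen are hypothetical names added only to record what z holds at the end of each pass. Because R duplicates before modifying, the constant in the loop body survives the first iteration's z[1]<- 2:

```r
# Hypothetical wrapper, named here for illustration only.
run_loop <- function() {
  seen <- numeric(0)
  for (i in 1:2) {
    z <- 1                  # z refers to the constant stored in the loop body
    if (i == 1) z[1] <- 2   # modifies a copy of that constant, not the constant
    seen <- c(seen, z)      # record z at the end of each iteration
  }
  seen
}

result <- run_loop()
print(result)               # z was 2 after the first pass and 1 after the second
```

If the constant itself were modified in place, the second pass would see 2 instead of 1.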
If you're assigning the result of a computation, then the object only
exists once, so
z<- 0+1 gets NAMED==1.
However, if the computation is done by returning a named value from within
a function, as in
f<- function(){v<- 1+0; v}
z<- f()
then again NAMED==2. This is because the side effects of the function
_might_ result in something having a hold on the function environment,
e.g. if we had
e<- NULL
f<- function(){e<<-environment(); v<- 1+0; v}
z<- f()
then z[1]<- 5 would change e$v too. As it happens, there aren't any side
effects in the former case, but R loses track and assumes the worst.
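For what it's worth, the second case can be run as written to see the protection at work. This is only a sketch of the observable behaviour: since the returned value is treated as shared, the assignment to z works on a copy and e$v is left alone.

```r
e <- NULL
f <- function() { e <<- environment(); v <- 1 + 0; v }

z <- f()      # z and e$v initially refer to the same value
z[1] <- 5     # R duplicates before modifying, so only z changes

print(z)      # the modified copy
print(e$v)    # the value still held in the function's environment
```

Without the conservative NAMED setting, the in-place modification would have leaked into e$v.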
Thanks a lot, I think I follow. That explains x vs y, but why is z NAMED==2? The result of data.frame() is an object that exists once (similar to 1:10), so shouldn't it be NAMED==1 too? Or does R lose track and assume the worst even on its own functions such as data.frame()?
R has several types of functions -- see the R Internals manual for
details. data.frame() is a plain R function, so it is treated no
differently than any user-written function. On the other hand, the
internal function that implements the : operator is a "primitive", so it
has complete control over its return value, and it can set NAMED in the
most efficient way.
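The distinction can be checked from the R prompt with is.primitive() and typeof(). A minimal sketch; the vector name kinds is just a label chosen for this example:

```r
# Which of the two functions discussed here are primitives?
kinds <- c(
  colon      = is.primitive(`:`),
  data.frame = is.primitive(data.frame)
)
print(kinds)

# typeof() reports the kind of function object as well.
print(typeof(`:`))
print(typeof(data.frame))
```

Only `:` should be reported as a primitive; data.frame is an ordinary closure written in R.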
So you might think that returning a value as an evaluation of a
primitive adds efficiency, e.g. in Peter's example
f<- function(){v<- 1+0; v + 0}
will return NAMED == 1. But that's because internally it had to make a
copy of v before adding 0 to it, so you've probably really made it less
efficient: the original version might never modify the result, so it
might never make a copy.
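To make the comparison concrete, here is a sketch of the two variants side by side. The names f_plain and f_plus0 are hypothetical and used only here. Both return the same value; they differ only in the internal bookkeeping, which .Internal(inspect()) can display, though its output varies between R versions and is not shown.

```r
f_plain <- function() { v <- 1 + 0; v }      # returns the named value itself
f_plus0 <- function() { v <- 1 + 0; v + 0 }  # returns a fresh result of `+`

a <- f_plain()
b <- f_plus0()

# The results are indistinguishable at the R level.
print(identical(a, b))
```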
Duncan Murdoch