name scoping within dataframe index

7 messages · Duncan Murdoch, Gabor Grothendieck, Alexy Khrabrov

#
Every time I have to prefix a dataframe column inside the indexing  
brackets with the dataframe name, e.g.

df[df$colname==value,]

-- I am wondering, why isn't there an R scoping rule under which the
name search starts with the dataframe's column names, as if we'd said

with(df, df[colname==value,])

-- wouldn't that be a reasonable default to prepend to the name search  
path?

Cheers,
Alexy
#
On 1/26/2009 1:46 PM, Alexy Khrabrov wrote:
If you did that, it would be quite difficult to get at a "colname" 
variable that *isn't* the column of df.  It would be something like

  df[get("colname", parent.frame()) == value,]

So just use subset(), or with(), or type the extra 3 chars.

Duncan
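
The collision described above can be sketched as follows. This is only an illustration: the data, the outer variable `colname`, and `value` are all made up, and `with()` stands in for the proposed column-first lookup rule.

```r
# Hypothetical data: a column named "colname" plus an unrelated
# variable of the same name in the calling environment.
df <- data.frame(colname = c(1, 2, 3), other = c("a", "b", "c"))
colname <- 2
value <- 2

# Current rule: an unprefixed name inside [ ] is the outer variable.
# colname == value is a single TRUE, which is recycled over all rows.
rows_by_variable <- df[colname == value, ]

# Column-first lookup (emulated with with()): the column masks the variable.
rows_by_column <- with(df, df[colname == value, ])

# Reaching the masked outer variable now takes an explicit environment.
outer_env <- environment()
rows_by_outer <- with(df, df[get("colname", envir = outer_env) == value, ])
```

Here `rows_by_variable` and `rows_by_outer` keep every row, while `rows_by_column` keeps only the row where the column equals `value`.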
#
Try:

subset(df, colname == value)

On Mon, Jan 26, 2009 at 1:46 PM, Alexy Khrabrov <deliverable at gmail.com> wrote:
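
For comparison, here is a small sketch of the three spellings discussed in this thread, run on made-up data. One difference worth knowing: subset() treats a missing value in the condition as FALSE, whereas logical indexing with [ ] returns an all-NA row for it.

```r
# Hypothetical data with one missing value in the tested column.
df <- data.frame(colname = c(1, 2, NA, 2), other = c("a", "b", "c", "d"))
value <- 2

by_prefix <- df[df$colname == value, ]         # explicit df$ prefix
by_with   <- with(df, df[colname == value, ])  # columns looked up first
by_subset <- subset(df, colname == value)      # rows where the test is NA are dropped

nrow(by_prefix)   # includes an all-NA row for the missing value
nrow(by_subset)   # only the rows where the test is TRUE
```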
#
Actually, what I propose is a special search rule which simply looks
at the enclosing dataframe.name[...] outside the brackets and looks up
the columns first.

It would break legacy code that uses column names identical to
variables in this context, but there are probably other ideas for
enhancing R readability that would also break legacy code.  Perhaps
when the next major overhaul occurs, this is something folks can voice
opinions about.  I find the need for inner prefixing quite unnatural, FWIW.

Cheers,
Alexy
#
On 1/26/2009 2:01 PM, Alexy Khrabrov wrote:
Yes, I understood that, and I explained why it would be a bad idea.

Duncan Murdoch
#
On Jan 26, 2009, at 2:12 PM, Duncan Murdoch wrote:
Well, this is the case in all programming languages with scoping where
inner-scope variables override the outer ones.  Usually it's solved
by prefixing with the outer scope, outerscope.name or
outerscope::name or so.  So it only underscores the need to improve
scoping access in R.

Dataframe column names belong to the dataframe object, and the natural
thing would be to make those names easy to access; you'd then need to
apply extra effort to access an overridden, unrelated external variable.
Again, just an analogy from other programming languages.

Cheers,
Alexy
#
On 1/26/2009 2:20 PM, Alexy Khrabrov wrote:
The issue is that in most cases the outer scope would be unnamed:  it's 
the one that currently doesn't need a prefix.  So if we have a prefix 
meaning "this scope", why wouldn't that evaluate to "df" in that 
context?  I guess we need a prefix meaning "the caller's scope", but 
that's just going to lead to confusion:  is it the caller of the 
function that is trying to index df, or the function trying to do the 
indexing?  So we'd need a prefix specific to indexing:  and that's just 
too ugly for words.

As I said, use subset() or with().  For subset selection, subset() works 
very nicely.  (I don't like the way it does column selection, but that's 
a different argument.)

Duncan Murdoch
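
As a sketch of the column selection mentioned above (made-up data; the column names are hypothetical): subset() takes a `select` argument in which column names are written unquoted, and a minus sign drops columns. The last line gives the same selection with ordinary indexing.

```r
# Hypothetical data with three columns.
df <- data.frame(colname = c(1, 2, 2),
                 other   = c("a", "b", "c"),
                 extra   = c(10, 20, 30))
value <- 2

# Row selection plus column selection, with unquoted names in `select`.
picked  <- subset(df, colname == value, select = c(colname, extra))

# A negative selection drops the named column instead.
dropped <- subset(df, colname == value, select = -other)

# The same selection with ordinary indexing and quoted column names.
explicit <- df[df$colname == value, c("colname", "extra")]
```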