Question about match.fun() - R-devel

Tue, May 9, 2006 12:49 AM

Dear all,

I was recently contacted by a user about an alleged problem/bug in
the latest version of lasso2.  After some investigation, we found out
that it was a user error which boils down to the following:

Error in get(x, envir, mode, inherits) : variable "fred" of mode "function" was not found

only that the "offending" apply() command happened inside the gl1ce()
function of lasso2.
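The commands that produced this error are not preserved in this copy of the message. A minimal sketch of a call that fails in the same way, assuming the user had a variable holding the string "fred" whose own name coincides with a function name (the names `x` and `var` are illustrative assumptions, not taken from the original):

```r
# Assumed reconstruction; `x` and `var` are illustrative names.
x <- matrix(rnorm(20), 10, 2)
var <- "fred"   # length-one character vector that names no function

# apply() passes its FUN argument through match.fun(), which treats the
# string "fred" as a function name, looks it up, and fails.
res <- tryCatch(apply(x, 2, var), error = function(e) e)
print(conditionMessage(res))
```

The exact wording of the error message differs between R versions, so the sketch only captures the condition rather than matching its text.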

I was under the impression that R can now distinguish between
variables and functions with the same name and, indeed, the following
works:

[1] 1.053002 1.250875
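The command behind this output is likewise missing here. A sketch of a call that behaves this way, assuming a non-character variable that shares its name with a function (again, `x` and `var` are illustrative assumptions):

```r
# Assumed reconstruction; `x` and `var` are illustrative names.
x <- matrix(rnorm(20), 10, 2)
var <- 1:10     # neither a function nor a length-one character vector

# match.fun() falls back to the symbol `var` as written in the call and
# finds the function stats::var, so this returns the column variances.
res <- apply(x, 2, var)
print(res)
```

The values depend on the random draw, so they will not reproduce the two numbers shown above.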

Poking around a bit, I guess that the ability to distinguish between
variables and functions with the same name comes from the introduction
of the function match.fun() and, after reading its help page, the
reason why an error is triggered the first time but not the second
time is perfectly clear to me.

I wonder whether it would make sense to change match.fun() such that
the first case does not result in an error?  I was thinking along the
lines that if the argument to match.fun() is a variable that contains a
character vector of length one then, using get(), match.fun() attempts
to find a function with that name.  If the get() command does not
succeed, then a second try is made using the name of the variable
passed by the caller to match.fun().
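The proposed lookup order can be sketched as a stand-alone function. This is only an illustration of the idea, not a patch to match.fun(): the name `match_fun_fallback` is hypothetical, and the sketch is called directly rather than from inside apply(), which simplifies how the caller's variable name is recovered.

```r
# Hypothetical helper illustrating the proposal; not part of R.
match_fun_fallback <- function(FUN) {
  if (is.function(FUN)) return(FUN)
  sym   <- substitute(FUN)   # the expression as written by the caller
  envir <- parent.frame()    # where the caller's names are visible
  if (is.character(FUN) && length(FUN) == 1L) {
    # First try: a function whose name is the value of the variable.
    f <- tryCatch(get(FUN, mode = "function", envir = envir),
                  error = function(e) NULL)
    if (!is.null(f)) return(f)
  }
  # Second try: a function whose name is the name of the variable itself.
  if (is.symbol(sym))
    return(get(as.character(sym), mode = "function", envir = envir))
  stop("argument is not a function, character string or symbol")
}

var <- "fred"                   # no function "fred": falls back to var()
f1 <- match_fun_fallback(var)

sd <- "mean"                    # function "mean" exists: first try wins
f2 <- match_fun_fallback(sd)
```

Under these assumptions `f1` is the variance function and `f2` is `mean`, which shows how the same syntax would resolve to either the value or the name depending on what happens to be visible.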

So before trying to modify match.fun() and submitting a patch, I
wanted to ask whether such a change would be accepted?  Is there an
argument I am missing for why the first case should always result in
an error and not be silently resolved?

Cheers,

        Berwin

Brian Ripley

Thu, May 11, 2006 6:18 AM

On Tue, 9 May 2006, Berwin A Turlach wrote:

No, not really.  It comes in general from the internal C functions knowing
from the parse context what they are looking for.

This is tricky, and indeed the bit that appears strange to me is that
the second works.  It comes from

r5628 | pd | 1999-08-26 14:31:42 +0100 (Thu, 26 Aug 1999) | 2 lines
match.fun fixes

which is not very informative, but I found e.g.

http://tolstoy.newcastle.edu.au/R/help/99b/0254.html
PD> This also applies (!) to various other places that need to deal
PD> with FUN arguments (apply, sapply, sweep, outer). It might be
PD> preferable to make match.fun smarter, at the expense of making it
PD> completely obscure.

(and I think we succeeded!)  The essence of that example would appear to 
be

xlev <- list(a=1:7, length=NULL)
sapply(xlev, is.null)

which failed long, long ago.

Note that ?apply (and so on) in 2.3.0 only mention the possibility of 
supplying a function or a character string.  Even the latter seems 
unnecessary these days now we have backquotes:

x <- matrix(runif(20), 10, 2)
apply(x, 2, `+`, 7)

I spent some time recently tidying up the family functions in R-devel. 
There the issues are similar but complicated by the fact that in 
binomial(probit) there is no object `probit'.
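A short illustration of the family case, assuming a plain session in which no package defines an object called probit: the unquoted and quoted forms both name the same link, even though the unquoted name is never evaluated as a variable.

```r
# binomial() captures its link argument unevaluated, so the bare name
# works although no object called `probit` needs to exist.
fam_sym <- binomial(probit)
fam_chr <- binomial("probit")

print(fam_sym$link)
print(fam_chr$link)
print(exists("probit"))   # whether an object of that name is visible
```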

I've come to the view that we are trying too hard in many of these cases.
So I would like to see arguments why we need to allow more than

         function
         symbol
         length-one character vector.

and I don't see that it lessens the confusion to allow the name of a
length-one character vector to mean either the value of the first element
of the object or a symbol if the value is not visible as a function.
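For reference, the three forms listed above can be checked directly against match.fun(). This is a small sketch; the symbol is constructed explicitly with as.name() for the purpose of the illustration.

```r
# The three argument forms: function, symbol, length-one character vector.
f_fun <- match.fun(mean)              # a function object
f_sym <- match.fun(as.name("mean"))   # a symbol
f_chr <- match.fun("mean")            # a length-one character vector

# All three resolve to the same function.
print(identical(f_fun, f_sym) && identical(f_sym, f_chr))
```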

The main argument is that it is not as documented, and confusing.

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595