eval(parse(text vs. get when accessing a function

Hi Martin,
Hi Ramon,

It seems like a naming convention (f.xxx) and eval(parse(...)) are
standing in for objects (of class 'GeneSelector', say, representing a
function with a particular form and doing a particular operation) and
dispatch (a function 'geneConverter' might handle a converter of class
'GeneSelector' one way, user supplied ad-hoc functions more carefully;
inside geneConverter the only real concern is that the converter
argument is in fact a callable function).

eval(parse(...)) brings scoping rules to the fore as an explicit
programming concern; here scope is implicit, but that's probably better
-- R will get its own rules right.

Martin

Here's an S4 sketch:

setClass("GeneSelector",
         contains="function",
         representation=representation(description="character"),
         validity=function(object) {
             msg <- NULL
             argNames <- names(formals(object))
             if (argNames[1]!="x")
               msg <- c(msg, "\n  GeneSelector requires a first argument named 'x'")
             if (!"..." %in% argNames)
               msg <- c(msg, "\n  GeneSelector requires '...' in its signature")
             if (0==length(object@description))
               msg <- c(msg, "\n  Please describe your GeneSelector")
             if (is.null(msg)) TRUE else msg
         })

setGeneric("geneConverter",
           function(converter, x, ...) standardGeneric("geneConverter"),
           signature=c("converter"))

setMethod("geneConverter",
          signature(converter="GeneSelector"),
          function(converter, x, ...) {
              ## important stuff here
              converter(x, ...)
          })

setMethod("geneConverter",
          signature(converter="function"),
          function(converter, x, ...) {
              message("ad-hoc converter; hope it works!")
              converter(x, ...)
          })

and then...

c1 <- new("GeneSelector",
          function(x, ...) prod(x, ...),
          description="Product of x")
c2 <- new("GeneSelector",
          function(x, ...) sum(x, ...),
          description="Sum of x")
geneConverter(c1, 1:4)
[1] 24
geneConverter(c2, 1:4)
[1] 10
geneConverter(mean, 1:4)
ad-hoc converter; hope it works!
[1] 2.5
cvterr <- new("GeneSelector", function(y) {})
Error in validObject(.Object) : invalid class "GeneSelector" object: 1:
  GeneSelector requires a first argument named 'x'
invalid class "GeneSelector" object: 2:
  GeneSelector requires '...' in its signature
invalid class "GeneSelector" object: 3:
  Please describe your GeneSelector
xxx <- 10
geneConverter(xxx, 1:4)
Error in function (classes, fdef, mtable)  :
        unable to find an inherited method for function "geneConverter", for signature "numeric"

Thanks!! That is actually a rather interesting alternative approach
and I can see it also adds a lot of structure to the problem. I have
to confess, though, that I am not a fan of OOP (nor of S4 classes); in
this case, in particular, it seems there is a lot of scaffolding in
the code above (the counterpoint to the structure?) and, regarding
scoping rules, I prefer to think about them explicitly (I find it much
simpler than inheritance).

Best,

R.
"Ramon Diaz-Uriarte" <rdiaz02 at gmail.com> writes:

Dear Greg,

On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
Ramon,

I prefer to use the list method for this type of thing; here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).

Those suggestions do apply to me of course (no claim to being
organized nor beyond idiocy here). And actually the suggestions on
this thread are being very useful. I think, though, that I was not
very clear on the context and my examples were too dumbed down. So
I'll try to give more detail (nothing here is secret, I am just trying
not to bore people).

The code is part of a web-based application, so there is no
interactive user. The R code is passed the arguments (and optional
user functions) from the CGI.

There is one "core" function (call it cvFunct) that, among other
things, does cross-validation. So this is one way to do things:

cvFunct <- function(whatever, genefiltertype, whateverelse) {
      internalGeneSelect <- eval(parse(text = paste("geneSelect",
                                             genefiltertype, sep = ".")))

      ## do things calling internalGeneSelect,
}

and now define all possible functions as

geneSelect.Fratio <- function(x, y, z) { }    ## something
geneSelect.Wilcoxon <- function(x, y, z) { }  ## something else

If I want more geneSelect functions, adding them is simple. And I can
even allow the user to pass her/his own functions, with the only
restrictions that the function takes three args, x, y, z, and that it
is named "geneSelect." followed by a user-chosen string. (Yes, I need
to make sure no calls to "system", etc, are in the user code, etc,
etc, but that is another issue).

The general idea is not new of course. For instance, in package
"e1071", a somewhat similar thing is done in function "tune", and
David Meyer there uses "do.call". However, tune is a lot more general
than what I had in mind. For instance, "tune" deals with arbitrary
functions, with arbitrary numbers and names of parameters, whereas my
functions above all take only three arguments (x: a matrix, y: a
vector; z: an integer), so the neat functionality provided by
"do.call", and passing the args as a list is not really needed.

So, given that my situation is so structured, and I do not need
"do.call", I think the approach via eval(parse(paste makes my life
simple:

a) the central function (cvFunct) uses something I can easily
recognize: "internalGeneSelect"

b) after the initial eval(parse(text I do not need to worry anymore
about what the "true" gene selection function is called

c) adding new functions and calling them is simple: function naming
follows a simple pattern ("geneSelect." + postfix) and calling the
user function only requires passing the postfix to cvFunct.

d) notice also that, at least the functs. I define, will of course not
be named "f.1", etc, but rather things like "geneSelect.Fratio" or
"geneSelect.namesThatStartWithCuteLetters";

I hope this makes things more clear. I did not include this detail
because this is probably boring (I guess most of you have stopped
reading by now :-).
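
For readers following along, points (a) to (d) above can be sketched in runnable form with get() in place of eval(parse(text = ...)). This is only an illustration under stated assumptions: the two selectors (geneSelect.First, geneSelect.Last), their bodies, and the argument order of cvFunct are hypothetical placeholders, not the real application code.

```r
## Hypothetical selectors, for illustration only. Each takes
## x (a matrix), y (a vector) and z (an integer), as described above.
geneSelect.First <- function(x, y, z) seq_len(z)
geneSelect.Last  <- function(x, y, z) rev(seq_len(nrow(x)))[seq_len(z)]

cvFunct <- function(x, y, z, genefiltertype) {
    ## Resolve "geneSelect.<postfix>" by name, without parsing text
    internalGeneSelect <- get(paste("geneSelect", genefiltertype, sep = "."))
    ## ... cross-validation would go here; we just call the selector
    internalGeneSelect(x, y, z)
}

x <- matrix(1:20, nrow = 5)
y <- c(0, 0, 1, 1)
sel_first <- cvFunct(x, y, 2L, "First")
sel_last  <- cvFunct(x, y, 2L, "Last")
```

The rest of cvFunct only ever sees internalGeneSelect, so the naming scheme stays exactly as described; only the lookup step differs.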

Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea.  Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.

With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.

But I don't see how having your functions as list elements is easier
(especially if the function is longer than 2 to 3 lines) than having
all functions systematically named things such as:

geneSelect.Fratio
geneSelect.Random
geneSelect.LetterA
etc

Of course, I could have a list with the components named "Fratio"
"Random", "LetterA". But I fail to see what it adds. And it forces me
to build the list, and probably rebuild it when (or not build it until)
the user enters her/his own selection function. But the latter I do not
need to do with the scheme above.

With your function, what if the user runs:

g(5,3)
What should it do?  (you have only shown definitions for f.1 and f.2).  With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now.  If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.

I see the general concern, but not how it applies here. If I pass
argument "Fratio" then either I use geneSelect.Fratio or I get an
error if "geneSelect.Fratio" does not exist. Similar to what would
happen if I do

g1(2, 8)

when f.8 is not defined:

Error in eval(expr, envir, enclos) : object "f.8" not found
So even in more general cases, except for function redefinitions, etc,
you are not able to call non-existent stuff.
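
A small sketch of that failure mode, with an explicit check so the error names the bad postfix. The definition of g1 is not shown in the thread, so the one below is an assumption made for illustration; exists() and get() are standard base R.

```r
f.1 <- function(x) x + 1
f.2 <- function(x) x + 2

## Assumed definition of g1 (hypothetical, not from the thread)
g1 <- function(x, fpost) {
    fname <- paste("f.", fpost, sep = "")
    ## Fail early, with a message of our own, if no such function exists
    if (!exists(fname, mode = "function"))
        stop("unknown postfix: ", fpost)
    get(fname, mode = "function")(x)
}

ok  <- g1(2, 2)
msg <- tryCatch(g1(2, 8), error = function(e) conditionMessage(e))
```

Either way a missing f.8 is an error; the check only makes the message easier to act on.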

2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again.  I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.

Yes, that is true. Again, it does not apply to the actual case I have
in mind, but of course, without the detailed info on context I just
gave, you could not know that.
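
For the general case, one difference between the two lookups may be worth a sketch: get(..., mode = "function") skips objects that are not functions, while eval(parse(text = ...)) returns whatever the name is bound to first. The clash below is contrived (a local variable that happens to be called f.1) and is only meant to show the mechanism.

```r
f.1 <- function(x) x + 1

g <- function(x, fpost) {
    f.1 <- "overwritten by accident"   # contrived clash, for illustration
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)                         # calledf is a string here: error
}

h <- function(x, fpost) {
    f.1 <- "overwritten by accident"   # same clash
    ## mode = "function" ignores the string and keeps searching
    calledf <- get(paste("f.", fpost, sep = ""), mode = "function")
    calledf(x)
}

g_result <- tryCatch(g(2, 1), error = function(e) "error")
h_result <- h(2, 1)
```

If the only f.1 anywhere is a non-function, both versions fail, so this does not remove the concern; it narrows it.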

3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.

Oh, sure. But all the functions above live in a single file (actually,
a minipackage) except for the optional user function (which is read
from a file).

Personally I have never regretted trying not to underestimate my own future stupidity.

Neither have I. And actually, that is why I asked: if Thomas Lumley
said, in the fortune, that I had better rethink the question, then I
should try rethinking it. But I asked because I failed to see what the
problem is.

Hope this helps,

It certainly does.

Best,

R.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
Diaz-Uriarte
Sent: Friday, January 05, 2007 11:41 AM
To: Peter Dalgaard
Cc: r-help; rdiaz02 at gmail.com
Subject: Re: [R] eval(parse(text vs. get when accessing a function

On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
Ramon Diaz-Uriarte wrote:
Dear All,

I've read Thomas Lumley's fortune "If the answer is parse() you
should usually rethink the question.". But I am not sure if that
also applies (and why) to other situations (Lumley's comment
http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
was in reply to accessing a list).

Suppose I have similarly called functions, except for a
postfix. E.g.
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}

And sometimes I want to call f.1 and some other times f.2 inside
another function. I can either do:

g <- function(x, fpost) {
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)
    ## do more stuff
}

Or:

h <- function(x, fpost) {
    calledf <- get(paste("f.", fpost, sep = ""))
    calledf(x)
    ## do more stuff
}

Two questions:
1) Why is the second better?

2) By changing g or h I could use "do.call" instead; why would that
be better? Because I can handle differences in argument lists?
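
For reference, a do.call variant of g and h might look like the sketch below. The name k is hypothetical; do.call() itself accepts either a function or a character string naming one, and takes the arguments as a list.

```r
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}

## Hypothetical third variant: the function is named by a string and
## the arguments are supplied as a list.
k <- function(x, fpost) {
    do.call(paste("f.", fpost, sep = ""), list(x))
    ## do more stuff
}

k1 <- k(10, 1)
k2 <- k(10, 2)
```

Because the arguments travel as a list, the same call works when different functions need different argument lists, which is the case raised in the question.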
Dear Peter,

Thanks for your answer.

Who says that they are better?  If the question is how to call a
function specified by half of its name, the answer could well be to
use parse(), the point is that you should rethink whether that was
really the right question.

Why not instead, e.g.

f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
h <- function(x, fpost) f[[fpost]](x)

h(2,"2")
[1] 4

h(2,"1")
[1] 3

I see, this is a direct way of dealing with the problem.
However, you first need to build the f list, and you might
not know about that ahead of time. For instance, if I build a
function so that the only thing that you need to do to use my
function g is to call your function "f.something", and then
pass the "something".
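
One way the list approach could accommodate functions that are not known ahead of time is to add them to the list when they arrive. The sketch below assumes hypothetical names (selectors, register, and the extra registry argument to h); it is not something proposed in the thread, only an illustration of the idea.

```r
selectors <- list("1" = function(x) x + 1,
                  "2" = function(x) x + 2)

## Hypothetical helper: add a user-supplied function under a name
register <- function(registry, name, fun) {
    stopifnot(is.function(fun))
    registry[[name]] <- fun
    registry
}

h <- function(x, fpost, registry) {
    fun <- registry[[fpost]]   # NULL when the name is not in the list
    if (!is.function(fun))
        stop("no function registered under '", fpost, "'")
    fun(x)
}

selectors <- register(selectors, "something", function(x) x * 10)
r1 <- h(2, "2", selectors)
r2 <- h(2, "something", selectors)
r3 <- tryCatch(h(2, "nope", selectors),
               error = function(e) conditionMessage(e))
```

Note that indexing a list by a name it does not contain returns NULL rather than an error, hence the explicit is.function() check.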

I am still under the impression that, given your answer,
using "eval(parse(text" is not your preferred way.  What are
the possible problems (if there are any, that is). I guess I
am puzzled by "rethink whether that was really the right question".

Thanks,

R.
--
Ramón Díaz-Uriarte
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center) Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

--
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org

