eval(parse(text vs. get when accessing a function
Hi Martin,
On 1/6/07, Martin Morgan <mtmorgan at fhcrc.org> wrote:
Hi Ramon,
It seems like a naming convention (f.xxx) and eval(parse(...)) are
standing in for objects (of class 'GeneSelector', say, representing a
function with a particular form and doing a particular operation) and
dispatch (a function 'geneConverter' might handle a converter of class
'GeneSelector' one way, user supplied ad-hoc functions more carefully;
inside geneConverter the only real concern is that the converter
argument is in fact a callable function).
eval(parse(...)) brings scoping rules to the fore as an explicit
programming concern; here scope is implicit, but that's probably better
-- R will get its own rules right.
Martin
Here's an S4 sketch:
setClass("GeneSelector",
         contains="function",
         representation=representation(description="character"),
         validity=function(object) {
             msg <- NULL
             argNames <- names(formals(object))
             if (argNames[1]!="x")
                 msg <- c(msg, "\n GeneSelector requires a first argument named 'x'")
             if (!"..." %in% argNames)
                 msg <- c(msg, "\n GeneSelector requires '...' in its signature")
             if (0==length(object@description))
                 msg <- c(msg, "\n Please describe your GeneSelector")
             if (is.null(msg)) TRUE else msg
         })
setGeneric("geneConverter",
           function(converter, x, ...) standardGeneric("geneConverter"),
           signature=c("converter"))
setMethod("geneConverter",
          signature(converter="GeneSelector"),
          function(converter, x, ...) {
              ## important stuff here
              converter(x, ...)
          })
setMethod("geneConverter",
          signature(converter="function"),
          function(converter, x, ...) {
              message("ad-hoc converter; hope it works!")
              converter(x, ...)
          })
and then...
> c1 <- new("GeneSelector",
+           function(x, ...) prod(x, ...),
+           description="Product of x")
> c2 <- new("GeneSelector",
+           function(x, ...) sum(x, ...),
+           description="Sum of x")
> geneConverter(c1, 1:4)
[1] 24
> geneConverter(c2, 1:4)
[1] 10
> geneConverter(mean, 1:4)
ad-hoc converter; hope it works!
[1] 2.5
> cvterr <- new("GeneSelector", function(y) {})
Error in validObject(.Object) :
  invalid class "GeneSelector" object: 1: GeneSelector requires a first argument named 'x'
invalid class "GeneSelector" object: 2: GeneSelector requires '...' in its signature
invalid class "GeneSelector" object: 3: Please describe your GeneSelector
> xxx <- 10
> geneConverter(xxx, 1:4)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "geneConverter", for signature "numeric"
Thanks!! That is actually a rather interesting alternative approach and I can see it also adds a lot of structure to the problem. I have to confess, though, that I am not a fan of OOP (nor of S4 classes); in this case, in particular, it seems there is a lot of scaffolding in the code above (the counterpoint to the structure?) and, regarding scoping rules, I prefer to think about them explicitly (I find it much simpler than inheritance). Best, R.
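For comparison, the signature check alone can be written without classes. A minimal sketch with a plain helper (checkGeneSelector is a hypothetical name; it mirrors only the first two validity tests of the S4 sketch and nothing more):

```r
## Hypothetical helper: verify a function has 'x' first and accepts '...'.
checkGeneSelector <- function(f) {
    if (!is.function(f))
        stop("converter must be a function")
    argNames <- names(formals(f))
    ## length check guards against functions with no arguments at all
    if (length(argNames) == 0L || argNames[1] != "x")
        stop("converter requires a first argument named 'x'")
    if (!"..." %in% argNames)
        stop("converter requires '...' in its signature")
    invisible(f)
}

## A conforming function passes through unchanged.
ok <- checkGeneSelector(function(x, ...) sum(x, ...))

## A non-conforming one is rejected; keep the message for inspection.
bad <- tryCatch(checkGeneSelector(function(y) NULL),
                error = function(e) conditionMessage(e))
```

This gives up dispatch and the description slot; it only shows that the argument check itself needs little scaffolding.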
"Ramon Diaz-Uriarte" <rdiaz02 at gmail.com> writes:
Dear Greg, On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
Ramon, I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
Those suggestions do apply to me of course (no claim to being
organized nor beyond idiocy here). And actually the suggestions on
this thread are being very useful. I think, though, that I was not
very clear on the context and my examples were too dumbed down. So
I'll try to give more detail (nothing here is secret, I am just trying
not to bore people).
The code is part of a web-based application, so there is no
interactive user. The R code is passed the arguments (and optional
user functions) from the CGI.
There is one "core" function (call it cvFunct) that, among other
things, does cross-validation. So this is one way to do things:
cvFunct <- function(whatever, genefiltertype, whateverelse) {
    internalGeneSelect <- eval(parse(text = paste("geneSelect",
                                                  genefiltertype, sep = ".")))
    ## do things calling internalGeneSelect
}
and now define all possible functions as
geneSelect.Fratio <- function(x, y, z) {}    ## something
geneSelect.Wilcoxon <- function(x, y, z) {}  ## something else
If I want more geneSelect functions, adding them is simple. And I can
even allow the user to pass her/his own functions, with the only
restrictions that each takes three args, x, y, z, and that the function
is named "geneSelect." followed by a user-chosen string. (Yes, I need
to make sure there are no calls to "system", etc., in the user code,
but that is another issue.)
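One caveat when the suffix arrives from the CGI: parse() turns the whole string into code, so a suffix containing a semicolon would evaluate more than a function name. A minimal sketch of a guarded lookup, assuming suffixes are restricted to letters, digits and underscores (the helper name lookupGeneSelect and the regular expression are illustrative, not part of the code described here):

```r
## Hypothetical stand-in for one of the real selection functions.
geneSelect.Fratio <- function(x, y, z) "Fratio called"

## Illustrative helper: validate the suffix, then look the function up
## by name without ever parsing the string as code.
lookupGeneSelect <- function(genefiltertype, envir = parent.frame()) {
    ## Assumed rule: only letters, digits and underscores are allowed.
    if (!is.character(genefiltertype) || length(genefiltertype) != 1L ||
        !grepl("^[A-Za-z0-9_]+$", genefiltertype))
        stop("invalid gene filter type")
    get(paste("geneSelect", genefiltertype, sep = "."),
        envir = envir, mode = "function")
}

internalGeneSelect <- lookupGeneSelect("Fratio")
result <- internalGeneSelect(1, 2, 3)

## A suffix carrying extra code is rejected before any lookup happens.
bad <- tryCatch(lookupGeneSelect("Fratio; stop('injected')"),
                error = function(e) conditionMessage(e))
```

This does not address what the user-supplied function body itself may do; it only keeps the name lookup from executing arbitrary text.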
The general idea is not new of course. For instance, in package
"e1071", a somewhat similar thing is done in function "tune", and
David Meyer there uses "do.call". However, tune is a lot more general
than what I had in mind. For instance, "tune" deals with arbitrary
functions, with arbitrary numbers and names of parameters, whereas my
functions above all take only three arguments (x: a matrix, y: a
vector; z: an integer), so the neat functionality provided by
"do.call", and passing the args as a list is not really needed.
So, given that my situation is so structured, and I do not need
"do.call", I think the approach via eval(parse(paste makes my life
simple:
a) the central function (cvFunct) uses something I can easily
recognize: "internalGeneSelect"
b) after the initial eval(parse(text I do not need to worry anymore
about what the "true" gene selection function is called
c) adding new functions and calling them is simple: function naming
follows a simple pattern ("geneSelect." + postfix) and calling the
user function only requires passing the postfix to cvFunct.
d) notice also that, at least the functs. I define, will of course not
be named "f.1", etc, but rather things like "geneSelect.Fratio" or
"geneSelect.namesThatStartWithCuteLetters";
I hope this makes things more clear. I did not include this detail
because this is probably boring (I guess most of you have stopped
reading by now :-).
Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea. Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them. With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
But I don't see how having your functions as list elements is easier (especially if the function is longer than 2 to 3 lines) than having all functions systematically named things such as:
geneSelect.Fratio
geneSelect.Random
geneSelect.LetterA
etc.
Of course, I could have a list with the components named "Fratio", "Random", "LetterA". But I fail to see what it adds. And it forces me to build the list, and probably rebuild it when (or not build it until) the user enters her/his own selection function. But the latter I do not need to do with the scheme above.
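For comparison, a named list does not have to be rebuilt when a user function arrives later; a new element can be assigned in place. A minimal sketch (the list name geneSelectors, the helper runSelector and the function bodies are placeholders):

```r
## Placeholder selectors; the real ones would do actual gene selection.
geneSelectors <- list(
    Fratio = function(x, y, z) "Fratio",
    Random = function(x, y, z) "Random"
)

## A user-supplied function read later is added without rebuilding the list.
userFunction <- function(x, y, z) "user-defined"
geneSelectors[["LetterA"]] <- userFunction

## Dispatch by name, with an explicit check for unknown names.
runSelector <- function(type, x, y, z) {
    if (!type %in% names(geneSelectors))
        stop("unknown gene selector: ", type)
    geneSelectors[[type]](x, y, z)
}

out <- runSelector("LetterA", NULL, NULL, 1L)
unknown <- tryCatch(runSelector("Nope", NULL, NULL, 1L),
                    error = function(e) e)
```

Whether this is simpler than the naming convention is a matter of taste; the sketch only shows that the list can grow at run time.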
With your function, what if the user runs:
g(5,3)
What should it do? (you have only shown definitions for f.1 and f.2). With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now. If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
I see the general concern, but not how it applies here. If I pass argument "Fratio" then either I use geneSelect.Fratio or I get an error if "geneSelect.Fratio" does not exist. Similar to what would happen if I do g1(2, 8) when f.8 is not defined:
Error in eval(expr, envir, enclos) : object "f.8" not found
So even in more general cases, except for function redefinitions, etc., you are not able to call non-existent stuff.
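Whether a stray f.3 from another project can be picked up depends on where the lookup searches. A sketch of one way to confine it, assuming the intended functions are kept in a dedicated environment (the name selectors is made up for this example):

```r
## Dedicated environment holding only the functions meant to be callable.
selectors <- new.env()
assign("f.1", function(x) x + 1, envir = selectors)
assign("f.2", function(x) x + 2, envir = selectors)

## A leftover function defined elsewhere (here: the current environment).
f.3 <- function(x) stop("should never be called")

h <- function(x, fpost) {
    ## inherits = FALSE: do not fall back to enclosing environments.
    calledf <- get(paste("f.", fpost, sep = ""),
                   envir = selectors, mode = "function", inherits = FALSE)
    calledf(x)
}

found <- h(2, 2)   # uses the f.2 stored in 'selectors'
res <- tryCatch(h(5, 3), error = function(e) "f.3 not found in selectors")
```

With the default lookup (no envir, inherits = TRUE) the leftover f.3 would be found and called instead.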
2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again. I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
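The overwriting scenario can at least be made to fail with a clear message. As a sketch: get() and exists() take a mode argument, so a name that now holds something other than a function is reported as missing, rather than failing later at the call (the check and its message are illustrative):

```r
f.1 <- function(x) x + 1
f.2 <- function(x) x + 2

## Later, f.1 is accidentally overwritten with a non-function object,
## e.g. stored regression coefficients.
f.1 <- c(intercept = 1.2, slope = 0.5)

g <- function(x, fpost) {
    fname <- paste("f.", fpost, sep = "")
    ## exists(..., mode = "function") ignores non-function objects.
    if (!exists(fname, mode = "function"))
        stop("no function named ", fname, " is available")
    get(fname, mode = "function")(x)
}

still_ok <- g(2, 2)
msg <- tryCatch(g(2, 1), error = function(e) conditionMessage(e))
```

This does not bring the lost function back, but it points directly at the name that was clobbered.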
Yes, that is true. Again, it does not apply to the actual case I have in mind, but of course, without the detailed info on context I just gave, you could not know that.
3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
Oh, sure. But all the functions above live in a single file (actually, a minipackage) except for the optional user function (which is read from a file).
Personally I have never regretted trying not to underestimate my own future stupidity.
Neither do I. And actually, that is why I asked: if Thomas Lumley said, in the fortune, that I had better rethink it, then I should try rethinking it. But I asked because I failed to see what the problem is.
Hope this helps,
It certainly does. Best, R.
-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at intermountainmail.org (801) 408-8111
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon Diaz-Uriarte
Sent: Friday, January 05, 2007 11:41 AM
To: Peter Dalgaard
Cc: r-help; rdiaz02 at gmail.com
Subject: Re: [R] eval(parse(text vs. get when accessing a function

On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
Ramon Diaz-Uriarte wrote:
Dear All,
I've read Thomas Lumley's fortune "If the answer is parse() you should usually rethink the question.". But I am not sure if that also applies (and why) to other situations (Lumley's comment http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html was in reply to accessing a list).
Suppose I have similarly named functions, except for a postfix. E.g.
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}
And sometimes I want to call f.1 and some other times f.2 inside
another function. I can either do:
g <- function(x, fpost) {
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)
    ## do more stuff
}
Or:
h <- function(x, fpost) {
    calledf <- get(paste("f.", fpost, sep = ""))
    calledf(x)
    ## do more stuff
}
Two questions:
1) Why is the second better?
2) By changing g or h I could use "do.call" instead; why would that
be better? Because I can handle differences in argument lists?
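To make the second question concrete: do.call accepts the function name as a character string and the arguments as a list, so a single call site can serve functions whose argument lists differ. A sketch reusing f.1 and f.2, plus a hypothetical two-argument f.3 and a wrapper k that are not part of the original question:

```r
f.1 <- function(x) {x + 1}
f.2 <- function(x) {x + 2}
## Hypothetical extra function with a different argument list.
f.3 <- function(x, times) {x * times}

k <- function(fpost, args) {
    ## do.call takes the function name as a string and the arguments
    ## as a list, so the argument list may differ per function.
    do.call(paste("f.", fpost, sep = ""), args)
}

r1 <- k(1, list(x = 2))             # calls f.1(x = 2)
r3 <- k(3, list(x = 2, times = 5))  # calls f.3(x = 2, times = 5)
```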
Dear Peter, Thanks for your answer.
Who says that they are better? If the question is how to call a
function specified by half of its name, the answer could well be to
use parse(), the point is that you should rethink whether that was
really the right question.
Why not instead, e.g.
f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
h <- function(x, fpost) f[[fpost]](x)

> h(2,"2")
[1] 4
> h(2,"1")
[1] 3
I see, this is a direct way of dealing with the problem. However, you first need to build the f list, and you might not know about that ahead of time. For instance, if I build a function so that the only thing that you need to do to use my function g is to call your function "f.something", and then pass the "something".
I am still under the impression that, given your answer, using "eval(parse(text" is not your preferred way. What are the possible problems (if there are any, that is)? I guess I am puzzled by "rethink whether that was really the right question".
Thanks,
R.
--
Ramón Díaz-Uriarte
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900
http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc)
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Ramon Diaz-Uriarte Statistical Computing Team Structural Biology and Biocomputing Programme Spanish National Cancer Centre (CNIO) http://ligarto.org/rdiaz
-- Martin T. Morgan Bioconductor / Computational Biology http://bioconductor.org