
finding global variables in a function containing formulae

6 messages · Hafen, Ryan P, Bert Gunter, Gabor Grothendieck +1 more

#
I need to find all global variables being used in a function and findGlobals() in the codetools package works quite nicely.  However, I am not able to find variables that are used in formulae.  Simply avoiding formulae in functions is not an option because I do not have control over what functions this will be applied to.

Here is an example to illustrate:

library(codetools)

xGlobal <- rnorm(10)
yGlobal <- rnorm(10)

plotFn1 <- function() {
   plot(yGlobal ~ xGlobal)
}

plotFn2 <- function() {
   y <- yGlobal
   x <- xGlobal
   plot(y ~ x)
}

plotFn3 <- function() {
   plot(xGlobal, yGlobal)
}

findGlobals(plotFn1, merge=FALSE)$variables
# character(0)
findGlobals(plotFn2, merge=FALSE)$variables
# [1] "xGlobal" "yGlobal"
findGlobals(plotFn3, merge=FALSE)$variables
# [1] "xGlobal" "yGlobal"

I would like to find that plotFn1 also uses globals xGlobal and yGlobal.  Any suggestions on how I might do this?
#
Does

?all.vars
## as in
> all.vars(y ~ x)
[1] "y" "x"

help?

-- Bert
On Thu, Nov 1, 2012 at 11:04 AM, Hafen, Ryan P <Ryan.Hafen at pnnl.gov> wrote:

  
    
1 day later
#
Thanks.  That works if I have the formula expression handy.  But suppose
I want a function, findGlobalVars(), that takes a function as an argument
and finds the globals in it, where I have absolutely no idea what is in the
supplied function:

findGlobalVars <- function(f) {
   require(codetools)
   findGlobals(f, merge=FALSE)$variables
}


findGlobalVars(plotFn1)

I would like findGlobalVars() to be able to find variables in formulae
that might be present in f.
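
One possible sketch of such a function combines findGlobals() with all.vars() applied to the whole function body. The name findGlobalVarsSketch and the exists() filter are assumptions made for illustration, not part of codetools. It is a heuristic: a local variable that shares its name with an object visible from the function's environment will be reported as well.

```r
library(codetools)

xGlobal <- rnorm(10)
yGlobal <- rnorm(10)

plotFn1 <- function() {
   plot(yGlobal ~ xGlobal)
}

# Hypothetical helper: union of what findGlobals() reports and the
# variable names in the body (formulae included) that resolve to an
# existing object when looked up from the function's environment.
findGlobalVarsSketch <- function(f, env = environment(f)) {
   # all.vars() descends into formulae, unlike findGlobals()
   candidates <- setdiff(all.vars(body(f)), names(formals(f)))
   # heuristic filter: keep only names that exist outside the function
   found <- vapply(candidates, exists, logical(1), envir = env)
   union(findGlobals(f, merge = FALSE)$variables, candidates[found])
}

res <- findGlobalVarsSketch(plotFn1)
print(res)
```

Because the filter only asks whether a name exists somewhere, this errs on the side of over-reporting.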
On 11/1/12 1:19 PM, "Bert Gunter" <gunter.berton at gene.com> wrote:

            
#
On Thu, Nov 1, 2012 at 2:04 PM, Hafen, Ryan P <Ryan.Hafen at pnnl.gov> wrote:
If this is only being applied to your own functions, then you could adopt a
convention when writing them: "declare" such variables at the top of the
function body so that findGlobals can locate them:


plotFn1 <- function() {
   xGlobal; yGlobal
   plot(yGlobal ~ xGlobal)
}

findGlobals(plotFn1)
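
A self-contained way to check this convention might look like the following. The name plotFnDeclared is used only here, to avoid overwriting plotFn1, and merge=FALSE is used as in the original post so that variables are reported separately from functions.

```r
library(codetools)

xGlobal <- rnorm(10)
yGlobal <- rnorm(10)

# Same as plotFn1, plus a line that mentions the globals as bare
# symbols so that findGlobals() sees them as variable references.
plotFnDeclared <- function() {
   xGlobal; yGlobal
   plot(yGlobal ~ xGlobal)
}

declared <- findGlobals(plotFnDeclared, merge = FALSE)$variables
print(declared)
```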
#
findGlobals must be explicitly ignoring calls to the ~ function.
You could poke through the source code of codetools and find
where this is happening.

Or, if you have the source code for the package you are investigating,
use sed to change all "~" to "%TILDE%" and then use findGlobals on
the resulting source code.  The messages will be a bit garbled but
should give you a start.  E.g., compare the following two, in which y
is defined in the function but x is not:
  > findGlobals(function(y)lm(y~x))
  [1] "~"  "lm"
  > findGlobals(function(y)lm(y %TILDE% x))
  [1] "lm"      "%TILDE%" "x"

You will get false alarms, since in a call like lm(y~x+z, data=dat) findGlobals
cannot know if dat includes columns called 'x', 'y', and 'z' and the above
approach errs on the side of reporting the potential problem.

You could also use code in codetools to work on the parsed S code, rather
than on the source text, and globally replace all calls to "~" with calls to
"%TILDE%", but that is more work than using sed on the source code.
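
For a single function already loaded in R, a rough equivalent of the sed substitution can be sketched with deparse() and gsub(). The helper name findGlobalsTildeSwap is hypothetical. Being a plain text substitution, it also rewrites any "~" inside string literals, and a one-sided formula such as ~ x will not parse after the swap.

```r
library(codetools)

# Hypothetical helper: textual "~" -> "%TILDE%" swap, then findGlobals().
findGlobalsTildeSwap <- function(f) {
   src <- deparse(f)                              # function as text
   src <- gsub("~", "%TILDE%", src, fixed = TRUE) # same idea as sed
   g <- eval(parse(text = src))                   # rebuild the function
   # drop the placeholder operator from the report
   setdiff(findGlobals(g), "%TILDE%")
}

res <- findGlobalsTildeSwap(function(y) lm(y ~ x))
print(res)
```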

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
I looked through some old notes and found you could
disable the special handler for "~" by removing it from
the environment codetools:::collectUsageHandlers:
  > findGlobals(function(y)lm(y~x)) # doesn't note 'x' as a global reference
  [1] "~"  "lm"
  > tildeHandler <- codetools:::collectUsageHandlers[["~"]]
  > remove("~", envir=codetools:::collectUsageHandlers)
  > findGlobals(function(y)lm(y~x)) # notes 'x'
  [1] "~"  "lm" "x"
  > # reinstall "~" handler to get original behavior
  > # or detach("package:codetools", unload=TRUE) and reattach
  > assign("~", tildeHandler, envir=codetools:::collectUsageHandlers)
  > findGlobals(function(y)lm(y~x)) # does not note 'x'
  [1] "~"  "lm"

You still have the false alarm problem.
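
The remove-and-restore steps above could be wrapped so that the handler is always put back, even if findGlobals() fails. This is only a sketch: findGlobalsWithFormulas is a hypothetical name, and the code relies on the unexported codetools:::collectUsageHandlers environment shown above, which may change between codetools versions.

```r
library(codetools)

# Hypothetical wrapper: temporarily remove the "~" handler, run
# findGlobals(), and restore the handler on exit.
findGlobalsWithFormulas <- function(f, ...) {
   handlers <- codetools:::collectUsageHandlers   # internal, unexported
   if (exists("~", envir = handlers, inherits = FALSE)) {
      tildeHandler <- get("~", envir = handlers, inherits = FALSE)
      # restore the original behavior however this function exits
      on.exit(assign("~", tildeHandler, envir = handlers), add = TRUE)
      remove("~", envir = handlers)
   }
   findGlobals(f, ...)
}

res <- findGlobalsWithFormulas(function(y) lm(y ~ x))
print(res)
```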

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com