Hi,
I've run into a problem calling the step function from within a function; I sent this to the R development list first, but the moderator said it was better suited to R help. My OS is Windows 7 and I'm using R version 3.2.3.
Here's a simple function to help reproduce the error:
> test.FN
function(dfr, scope, k=2){
  temp.lm=lm(scope$lower, data=dfr)
  step(temp.lm, scope=scope$upper, k=k)
}
And here's the code that gives the error when calling the function above:
# Begin by setting the rng seed.
> set.seed(523)
# Generate a design matrix and response.
> X.des=matrix(abs(rnorm(50*20, sd=4)), nrow=50)
> Y=20 + X.des[, 1:3] %*% matrix(c(3, -4, 2), nrow=3) + rnorm(50)
> X.des=cbind(as.data.frame(X.des), Y)
# Create the lower and upper formula components of a list.
> test.scope=list(lower=as.formula(Y ~ 1), upper=as.formula(paste("Y ~ ", paste(names(X.des)[1:20], collapse=" + "), sep="")))
# Run 'test.FN'.
> test.FN(dfr=X.des, scope=test.scope)
Start: AIC=257.58
Y ~ 1
Error in is.data.frame(data) : object 'dfr' not found
> traceback()
11: is.data.frame(data)
10: model.frame.default(formula = Y ~ V1 + V2 + V3 + V4 + V5 + V6 +
V7 + V8 + V9 + V10 + V11 + V12 + V13 + V14 + V15 + V16 +
V17 + V18 + V19 + V20, data = dfr, drop.unused.levels = TRUE)
9: stats::model.frame(formula = Y ~ V1 + V2 + V3 + V4 + V5 + V6 +
V7 + V8 + V9 + V10 + V11 + V12 + V13 + V14 + V15 + V16 +
V17 + V18 + V19 + V20, data = dfr, drop.unused.levels = TRUE)
8: eval(expr, envir, enclos)
7: eval(fcall, env)
6: model.frame.lm(fob, xlev = object$xlevels)
5: model.frame(fob, xlev = object$xlevels)
4: add1.lm(fit, scope$add, scale = scale, trace = trace, k = k,
...)
3: add1(fit, scope$add, scale = scale, trace = trace, k = k, ...)
2: step(temp.lm, scope = scope$upper, k = k) at #3
1: test.FN(dfr = X.des, scope = test.scope)
The call to the traceback function indicates add1 doesn't see the dataframe dfr passed to test.FN. The step function runs fine when I do everything in the global environment without using test.FN. I know the lexical scoping rules are different for objects involving model formulae, but despite a fair amount of experimentation, I haven't found any way to make the step / add1 functions see the dataframe that's passed to test.FN. Any help would be greatly appreciated.
Thanks,
Paul Louisell
Statistical Specialist
Paul.Louisell at pw.utc.com
860-565-8104
Still, tomorrow's going to be another working day, and I'm trying to get some rest.
That's all, I'm trying to get some rest.
Paul Simon, "American Tune"
Lexical scoping for step and add1 functions
4 messages · Louisell, Paul T PW, Jeff Newmiller, S Ellison
1 day later
In a nutshell, a formula carries the environment in which it was
created along with its variable names. Your dfr exists only inside
test.FN, but the formulas were created in the global environment. When
step (via add1) re-evaluates the stored lm call, it does so in the
formula's environment, so data = dfr is looked up in the global
environment and not found. I got this to work by defining the formulas
as character strings in the global environment and converting those
strings to formulas inside the function, so that their environment is
the function's own. I don't think you can trick lm into referring to the
global environment from within test.FN so that summaries refer to the
X.des data frame instead of dfr (but someone could prove me wrong).
################################
test.FN <- function( dfr, scope, k = 2 ) {
  # Convert the strings here so that the formulas' environment is this
  # function's frame, where dfr exists.
  scp <- list( lower = as.formula( scope$lower )
             , upper = as.formula( scope$upper )
             )
  temp.lm <- lm( scp$lower
               , data = dfr
               )
  step( temp.lm
      , scope = scp
      , k = k
      )
}
# Begin by setting the rng seed.
set.seed( 523 )
# Generate a design matrix and response.
X.des <- matrix( abs( rnorm( 50 * 20, sd = 4 ) ), nrow = 50 )
Y <- 20 + X.des[, 1:3 ] %*% matrix( c( 3, -4, 2 ), nrow = 3 ) + rnorm( 50 )
X.des <- cbind( as.data.frame( X.des ), Y )
# Create the lower and upper scope components as character strings.
test.scope <- list( lower = "Y ~ 1"
                  , upper = paste( "Y ~"
                                 , paste( names( X.des )[ 1:20 ]
                                        , collapse = " + "
                                        )
                                 , sep = ""
                                 )
                  )
# Run 'test.FN'.
test.FN( dfr = X.des
       , scope = test.scope
       )
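The point about formulas carrying their environment can be checked directly. The following is only a small sketch; make_formula and local_data are hypothetical names introduced for illustration and do not appear in the thread.

```r
# A formula remembers the environment in which it was created.
here  <- environment()     # the environment this code is running in
f_top <- Y ~ 1             # created at this level
identical(environment(f_top), here)

# Hypothetical helper: turn a string into a formula inside a function.
make_formula <- function(txt) {
  local_data <- data.frame(Y = 1:3)  # exists only in this call's frame
  as.formula(txt)                    # env defaults to the calling frame, i.e. this one
}
f_fn <- make_formula("Y ~ 1")

# The formula built inside the function can see local_data;
# the one built at the top level cannot.
exists("local_data", envir = environment(f_fn),  inherits = FALSE)
exists("local_data", envir = environment(f_top), inherits = FALSE)
```

This is why converting the strings inside test.FN helps: the resulting formulas point at the frame that holds dfr.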
On Mon, 7 Mar 2016, Louisell, Paul T PW wrote:
> [...]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>      Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
If you want a function to refer to something in the global environment, just refer to the global object in the function. If the object name isn't defined in the function's own scope, it is sought in the enclosing environment (the one in which the function was defined; here, the global environment). So the original code works if
test.FN <- function(scope, k=2){
  temp.lm=lm(scope$lower, data=X.des) ## X.des is sought in the enclosing environment
  step(temp.lm, scope=scope$upper, k=k)
}
Admittedly, I'd not regard that kind of thing as a good idea; fragile and inflexible. But if you are clear about scope it does work.
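As a small illustration of that lookup rule (a sketch only; show_x and caller are hypothetical names): a free variable is resolved where the function was defined, not where it is called from.

```r
x <- "global"

# x is a free variable in show_x, so it is looked up in the
# environment where show_x was defined.
show_x <- function() x

# caller has its own x, but show_x does not see it.
caller <- function() {
  x <- "local to caller"
  show_x()
}

caller()   # returns "global"
```

The same rule is what lets data=X.des inside test.FN resolve to the global X.des.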
Another way to proceed, somewhat more safely, is to wrap the whole thing in a function, passing X.des as dfr, and to define test.FN and the formulas inside that outer function, so that you know where it's going to get dfr from. Something along the lines of
test.step <- function(dfr, Y) {
  test.FN <- function(scope, k=2){
    temp.lm=lm(scope$lower, data=dfr) ## dfr (here X.des) is found in test.step
    print(temp.lm)
    step(temp.lm, scope=as.formula(scope$upper), k=k)
  }
  ## The formulas are created here, so their environment contains dfr.
  scope <- list( lower= as.formula('Y~1'),
                 upper=as.formula(paste('Y~', paste(names(dfr[1:20]), collapse="+")))
               )
  test.FN(scope=scope)
}
test.step(X.des, Y)
S Ellison
So in both my solution and your second option, the call that lm records (and that print and summary display) refers to dfr, a name internal to the function that the user might prefer to be unaware of (they know what X.des is). Your first solution avoids that but hardcodes access to the global variable, so a user who wants a different data frame has to define a different function. I am OK with that, but thought that there might be a way to indirectly tell lm to use the global environment via the parameter dfr.
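One possible way to get that effect, offered only as a sketch and not tested beyond this small case: capture the name the caller passed with substitute, build the lm call around that name, and evaluate it in the caller's frame. test.FN2, dfr.name and fit.call are hypothetical names, and the sketch assumes the caller passes a plain variable name that is visible from the environment of the scope formulas.

```r
# Hypothetical variant of test.FN: the recorded lm call names the
# caller's data frame instead of dfr.
test.FN2 <- function(dfr, scope, k = 2) {
  dfr.name <- substitute(dfr)    # what the caller typed, e.g. X.des
  stopifnot(is.name(dfr.name))   # this sketch handles plain names only
  fit.call <- bquote(lm(.(scope$lower), data = .(dfr.name)))
  temp.lm <- eval(fit.call, parent.frame())  # fit where the caller's data lives
  step(temp.lm, scope = scope$upper, k = k, trace = 0)
}

# Smaller version of the example data: 5 candidate predictors.
set.seed(523)
X.des <- as.data.frame(matrix(abs(rnorm(50 * 5, sd = 4)), nrow = 50))
X.des$Y <- 20 + 3 * X.des$V1 - 4 * X.des$V2 + 2 * X.des$V3 + rnorm(50)
test.scope <- list(lower = Y ~ 1, upper = Y ~ V1 + V2 + V3 + V4 + V5)

fit <- test.FN2(dfr = X.des, scope = test.scope)
fit$call   # the data argument is recorded as X.des, not dfr
```

The trade-off is that the function now depends on how it was called: passing an expression such as X.des[1:40, ] is rejected by the stopifnot check, and a data frame that lives only inside some other function may not be found when step re-fits the model.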
Sent from my phone. Please excuse my brevity.
On March 9, 2016 3:05:28 AM PST, S Ellison <S.Ellison at LGCGroup.com> wrote:
> [...]