scoping/non-standard evaluation issue
Dear Gabor,

I used str() to look at the two objects but missed the difference that you found. What I didn't quite understand was why one model worked but not the other when both were defined at the command prompt in the global environment.

Thanks,
 John

--------------------------------
John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Gabor Grothendieck
Sent: January-04-11 6:56 PM
To: John Fox
Cc: Sanford Weisberg; r-devel at r-project.org
Subject: Re: [Rd] scoping/non-standard evaluation issue

On Tue, Jan 4, 2011 at 4:35 PM, John Fox <jfox at mcmaster.ca> wrote:
Dear r-devel list members,

On a couple of occasions I've encountered the issue illustrated by the following examples:

--------- snip -----------
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
        Armed.Forces + Population + Year, data=longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)
all.equal(mod.1, mod.2)
[1] TRUE
f <- function(mod){
    subs <- 1:10
    update(mod, subset=subs)
    }
f(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found

--------- snip -----------

I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or the formulas therein, are associated with different environments, but I don't quite see why. Anyway, here are two "solutions" that work, but neither is in my view desirable:

--------- snip -----------
f1 <- function(mod){
    assign(".subs", 1:10, envir=.GlobalEnv)
    on.exit(remove(".subs", envir=.GlobalEnv))
    update(mod, subset=.subs)
    }
f1(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
f1(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
f2 <- function(mod){
    env <- new.env(parent=.GlobalEnv)
    attach(NULL)
    on.exit(detach())
    assign(".subs", 1:10, pos=2)
    update(mod, subset=.subs)
    }
f2(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
f2(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

--------- snip -----------

The problem with f1() is that it will clobber a variable named .subs in the global environment; the problem with f2() is that .subs can be masked by a variable in the global environment. Is there a better approach?
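One possible alternative, offered here only as a sketch and not as anything proposed in the thread: splice the value of the subset vector into the update() call with bquote(), so that no variable has to be looked up when the model is refitted. The name f3 is hypothetical.

```r
# Hypothetical helper (not from the thread): inline the *value* of subs
# into the call, so nothing named 'subs' has to be found later.
f3 <- function(mod) {
    subs <- 1:10
    # bquote() replaces .(subs) with the vector itself before evaluation
    eval(bquote(update(mod, subset = .(subs))))
}

mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

fit.1 <- f3(mod.1)
fit.2 <- f3(mod.2)
```

The trade-off is that the call stored in the refitted model contains the literal values rather than a variable name, which may be unwieldy for a long subset vector.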
I think there is something wrong with R here since the formula in the call component of mod.1 has a "call" class whereas the corresponding call component of mod.2 has "formula" class:
class(mod.1$call[[2]])
[1] "call"
class(mod.2$call[[2]])
[1] "formula"

If we reset call[[2]] to have "call" class then it works:
mod.2a <- mod.2
mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
f(mod.2a)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
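The class difference appears to matter because of how `~` behaves when evaluated. The sketch below illustrates this, assuming (as I understand it) that update() re-evaluates the stored call in the frame of the function that called it. The names from_call, fo.1, fo.2, fo.top and top are hypothetical and are not part of the thread.

```r
top <- environment()   # the environment in which this snippet is run

# Evaluate 'expr' inside a function frame that contains 'subs'
from_call <- function(expr) {
    subs <- 1:10
    eval(expr)
}

# Case 1: an unevaluated `~` call (class "call", like mod.1$call[[2]]).
# Evaluating it creates a new formula whose environment is the function
# frame, so 'subs' is visible from the formula's environment.
fo.1 <- from_call(quote(y ~ x))
sees.subs <- exists("subs", envir = environment(fo.1), inherits = FALSE)

# Case 2: an existing formula object (class "formula", like mod.2$call[[2]]).
# Evaluating it returns it with its original environment, so 'subs' in the
# function frame is not reachable through the formula.
fo.top <- y ~ x
fo.2 <- from_call(fo.top)
keeps.env <- identical(environment(fo.2), top)
```

Here sees.subs and keeps.env should both be TRUE. Since the subset argument is looked up in the data and then in the formula's environment, this would explain why f(mod.1) finds subs while f(mod.2) does not.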
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel