Thanks,
John
--------------------------------
John Fox
Senator William McMaster
Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
On Behalf Of Gabor Grothendieck
Sent: January-04-11 6:56 PM
To: John Fox
Cc: Sanford Weisberg; r-devel at r-project.org
Subject: Re: [Rd] scoping/non-standard evaluation issue
On Tue, Jan 4, 2011 at 4:35 PM, John Fox <jfox at mcmaster.ca> wrote:
Dear r-devel list members,
On a couple of occasions I've encountered the issue illustrated by the
following examples:
--------- snip -----------
> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
+         Armed.Forces + Population + Year, data=longley)
> mod.2 <- update(mod.1, . ~ . - Year + Year)
> f <- function(mod){
+     subs <- 1:10
+     update(mod, subset=subs)
+ }
> f(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found
--------- snip -----------
I *almost* understand what's going on -- that is, clearly mod.1 and mod.2,
or the formulas therein, are associated with different environments -- but I
don't quite see why.
Anyway, here are two "solutions" that work, but neither is in my view
desirable:
--------- snip -----------
> f1 <- function(mod){
+     assign(".subs", 1:10, envir=.GlobalEnv)
+     on.exit(remove(".subs", envir=.GlobalEnv))
+     update(mod, subset=.subs)
+ }
> f1(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f1(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f2 <- function(mod){
+     env <- new.env(parent=.GlobalEnv)
+     attach(NULL)
+     on.exit(detach())
+     assign(".subs", 1:10, pos=2)
+     update(mod, subset=.subs)
+ }
> f2(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
> f2(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
--------- snip -----------
The problem with f1() is that it will clobber a variable named .subs in
the global environment; the problem with f2() is that .subs can be masked by
a variable of the same name in the global environment.
Is there a better approach?
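One possible approach, offered only as a minimal sketch and not as something proposed in this thread, is to place the subset values themselves into the stored call rather than a symbol that must be looked up later. The name f3 is hypothetical. Because the subset argument then evaluates to a literal vector, it no longer matters which environment the formula carries.

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# f3 is a hypothetical name used only for this illustration.
f3 <- function(mod) {
    subs <- 1:10
    call <- mod$call              # the call that originally fitted the model
    call$subset <- subs           # store the values, not the symbol `subs`
    eval(call, parent.frame())    # refit where f3() was called from
}

fit.1 <- f3(mod.1)
fit.2 <- f3(mod.2)
n.1 <- length(residuals(fit.1))
n.2 <- length(residuals(fit.2))
```

The trade-off in this sketch is that the refitted model's call records the literal values in its subset argument instead of a variable name, and nothing is created in or removed from the global environment.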
I think there is something wrong with R here, since the formula in the
call component of mod.1 has "call" class whereas the corresponding
formula in the call component of mod.2 has "formula" class:
> class(mod.1$call[[2]])
[1] "call"
> class(mod.2$call[[2]])
[1] "formula"
If we reset call[[2]] to have "call" class then it works:
> mod.2a <- mod.2
> mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
> f(mod.2a)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
Population + Year, data = longley, subset = subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00
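The class difference above is tied to environments, and the following minimal sketch (an illustration, not part of the original messages) shows one way to inspect it. In mod.1 the formula argument is stored as an unevaluated call, so it acquires its environment wherever the call is re-evaluated, for example inside f(). In mod.2, update() stored a formula object that already carries the environment of the original fit, and model.frame() looks up subset there. The variable names cls.1, env.1, cls.2 and env.2 are introduced here for illustration only.

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# mod.1: the stored formula is a bare call with no environment attached
cls.1 <- class(mod.1$call[[2]])
env.1 <- environment(mod.1$call[[2]])

# mod.2: the stored formula is a formula object with its own environment
cls.2 <- class(mod.2$call[[2]])
env.2 <- environment(mod.2$call[[2]])

print(list(cls.1 = cls.1, env.1 = env.1, cls.2 = cls.2, env.2 = env.2))
```

When run at top level, env.2 would be expected to be the global environment, which is why a variable local to f() is not visible when mod.2 is updated.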
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com