
scoping/non-standard evaluation issue

Dear r-devel list members,

On a couple of occasions I've encountered the issue illustrated by the
following examples:

--------- snip -----------
> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
+         Armed.Forces + Population + Year, data=longley)
[1] TRUE
> f <- function(mod){
+     subs <- 1:10
+     update(mod, subset=subs)
+     }
> f(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00
> f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found

--------- snip -----------

I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
the formulas therein, are associated with different environments, but I
don't quite see why.
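
A minimal sketch of one way to look at the difference. It assumes that mod.2 is produced from mod.1 by update() with a formula argument; that construction, and the names stored.1, stored.2, ok.1 and ok.2, are assumptions made for illustration only:

```r
# Fit the model directly from a formula typed in the call
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data=longley)

# ASSUMPTION: a second, equivalent model obtained by updating the formula
mod.2 <- update(mod.1, . ~ . - Year + Year)

f <- function(mod){
    subs <- 1:10
    update(mod, subset=subs)
}

# Is the formula stored in each call already a formula object
# (carrying its own environment), or still an unevaluated expression?
stored.1 <- inherits(mod.1$call$formula, "formula")
stored.2 <- inherits(mod.2$call$formula, "formula")

# Does f() succeed for each model?
ok.1 <- !inherits(try(f(mod.1), silent=TRUE), "try-error")
ok.2 <- !inherits(try(f(mod.2), silent=TRUE), "try-error")

print(c(stored.1=stored.1, stored.2=stored.2, ok.1=ok.1, ok.2=ok.2))
```

If one call stores a bare expression and the other an already-evaluated formula object, the latter brings its own environment with it, which would be consistent with subs being looked up somewhere other than the frame of f().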

Anyway, here are two "solutions" that work, but neither is in my view
desirable:

--------- snip -----------
> f1 <- function(mod){
+     assign(".subs", 1:10, envir=.GlobalEnv)
+     on.exit(remove(".subs", envir=.GlobalEnv))
+     update(mod, subset=.subs)
+     }
> f1(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00
> f1(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00
> f2 <- function(mod){
+     env <- new.env(parent=.GlobalEnv)
+     attach(NULL)
+     on.exit(detach())
+     assign(".subs", 1:10, pos=2)
+     update(mod, subset=.subs)
+     }
> f2(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00
> f2(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  

--------- snip -----------

The problem with f1() is that it will clobber a variable named .subs in the
global environment; the problem with f2() is that .subs can be masked by a
variable in the global environment.
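
Both drawbacks can be seen in a small sketch. It assumes f1() and f2() as defined above; the values placed in the global environment and the names res.1, res.2, survived and n.used are made up for illustration:

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data=longley)

f1 <- function(mod){
    assign(".subs", 1:10, envir=.GlobalEnv)
    on.exit(remove(".subs", envir=.GlobalEnv))
    update(mod, subset=.subs)
}

f2 <- function(mod){
    env <- new.env(parent=.GlobalEnv)   # kept from the original; not used
    attach(NULL)
    on.exit(detach())
    assign(".subs", 1:10, pos=2)
    update(mod, subset=.subs)
}

# f1(): a .subs already in the global environment is overwritten,
# then removed when f1() exits
assign(".subs", "a value the user wants to keep", envir=.GlobalEnv)
res.1 <- f1(mod.1)
survived <- exists(".subs", envir=.GlobalEnv, inherits=FALSE)

# f2(): a .subs in the global environment is found before the copy
# that f2() places at position 2 of the search path
assign(".subs", 1:12, envir=.GlobalEnv)
res.2 <- f2(mod.1)
n.used <- nobs(res.2)   # number of rows actually used in the refit

remove(".subs", envir=.GlobalEnv)      # tidy up the illustration
print(list(survived=survived, n.used=n.used))
```

Here survived reports whether the user's own .subs is still there after f1(), and n.used shows whether f2() refit the model on the intended 10 rows or on whatever the global .subs selected.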

Is there a better approach?

Thanks,
 John

--------------------------------
John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox