understanding lexical scope
9 messages · joseph.g.boyer at gsk.com, Antonio, Fabio Di Narzo, Thomas Lumley +4 more
2008/12/18 <joseph.g.boyer at gsk.com>:
I am trying to understand the concept of lexical scope in "An Introduction
to R" by the R Core development team.
I'd appreciate it if someone would explain why the following example does
not work:
q <- function(y) {x + y}; w <- function(x){q(x)}; w(2);
According to the discussion of Scope on page 46, it seems to me that R will interpret the free variable x in q as the parameter x in w, and so will give w(2) = 2+2.
Joe Boyer
Statistical Sciences
Why? R will look at the enclosing environment, which here is the workspace. Maybe you meant:
w <- function(x){ q <- function(y) x+y; q(x)}; w(2)
which works as you said. HTH, Antonio.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Antonio, Fabio Di Narzo Ph.D. student at Department of Statistical Sciences University of Bologna, Italy
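For anyone following along, here is a minimal sketch of the two versions side by side. It assumes a fresh session with no x in the workspace; the names res_global, w_nested and res_nested are made up for illustration.

```r
# q is defined at top level, so its free variable x is looked up there,
# not in the function that happens to call it.
q <- function(y) x + y
w <- function(x) q(x)

# tryCatch turns the failed lookup into a message we can inspect
# instead of stopping the script.
res_global <- tryCatch(w(2), error = function(e) conditionMessage(e))
print(res_global)

# Nested version: q is created inside the function, so that function's x
# sits in q's enclosing environment.
w_nested <- function(x) {
  q <- function(y) x + y
  q(x)
}
res_nested <- w_nested(2)
print(res_nested)  # 4
```

The only difference between the two is where q is defined, not how it is called.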
On Thu, 18 Dec 2008 joseph.g.boyer at gsk.com wrote:
According to the discussion of Scope on page 46, it seems to me that R will interpret the free variable x in q as the parameter x in w, and so will give w(2) = 2+2.
No, not at all. The function q() is not defined inside w(), it is defined in the global environment. Inside q(), x is first looked up as a local variable, without success, and then looked up in the environment where q() was defined (the global environment), also without success.
There is an x in the calling environment of q(), i.e., inside w(), but finding things in the calling environment is dynamic scope rather than lexical scope.
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
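The distinction can be made concrete with a small sketch. It is only an illustration: q_lexical, q_dynamic, here and defined_here are hypothetical names, and the dynamic-style lookup is emulated by asking for the caller's frame explicitly with parent.frame(), which is not what R does by default.

```r
# Record the environment this code runs in (the workspace at top level).
here <- environment()

# Lexical version: x would be looked up where q_lexical was defined.
q_lexical <- function(y) x + y
defined_here <- identical(environment(q_lexical), here)

# Dynamic-style version: explicitly fetch x from whoever called us.
q_dynamic <- function(y) {
  caller <- parent.frame()
  get("x", envir = caller) + y
}

w <- function(x) q_dynamic(x)
res <- w(2)

print(defined_here)  # TRUE
print(res)           # 4
```

Getting the behaviour the original post expected takes an explicit request for the calling frame; R never goes there on its own when resolving a free variable.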
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of joseph.g.boyer at gsk.com
Sent: Friday, December 19, 2008 7:41 AM
To: Thomas Lumley
Cc: r-help at r-project.org
Subject: Re: [R] understanding lexical scope
Thomas, Jeff, Mark, Antonio,
Thank you for your answers. They have helped me clarify how R functions work. They work differently from SAS functions (which SAS calls macros.)
Well, SAS macros are not functions in the traditional sense. The SAS macro language for the most part just does text substitution prior to the SAS code being sent to the SAS "compiler"/interpreter. So, your description of rewriting the "function body" in step 1 below is fairly accurate for SAS macros, but it is not accurate for R. If you try to fit R functions into a SAS macro language mold you will only confuse yourself on both counts. I will leave the technical details of R functions to the R experts.
If you know SAS, consider the following code:
*********************
%macro q(y);
data one;
outvar = &y. + &x.; output;
call symputx("outvar", outvar, "G");
run;
%mend;
%macro w(x);
%q(&x.);
%put &outvar.;
%mend;
**************
Then %w(2); will result in the value 4 being placed in the SAS log.
To me, while the coding is quite awkward, the execution is logical. The variable x has been defined by the call to the macro w, so there is no problem when SAS encounters a reference to x in the macro q.
But in the equivalent code in R,
q <- function(y) y +x; w <- function(x) q(x); w(2);
when R can't find the second argument of q in the local environment of the macro q, it doesn't look in the local environment of the macro w, it goes all the way back to the global environment, as you have all pointed out.
If you want to try to compare the R language to SAS language (not favorable to SAS for most on this list), the better comparison for understanding is the data step language, not SAS macro.
So in my little model of how R functions work, when a function is called
1. R rewrites the body of the function, replacing all of the parameter names with the values given to them in the function call.
2. R then tries to execute the expressions. But R only "remembers" the assignment of values to parameter names during step 1. Thus in our example it has to go to the global environment to find a value for "x" referenced in q.
Is this right?
I bet one of the expeRts on the list will provide you with more detail than you could have ever hoped for.
Dan
Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
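On the "rewrites the body" model: a rough sketch of what can be observed instead, under the usual description that each call gets a fresh environment holding the argument bindings, whose enclosure is the environment where the function was defined. The function f and the list element names are invented for this illustration.

```r
f <- function(a) {
  # Evaluated first, so only the argument is bound at this point.
  local_names <- ls()
  # The environment created for this particular call.
  frame <- environment()
  list(
    local_names = local_names,
    # The call environment's parent is where f was defined,
    # not the environment of whoever called f.
    enclosure_is_defining_env = identical(parent.env(frame), environment(f))
  )
}

info <- f(2)
print(info$local_names)                # "a"
print(info$enclosure_is_defining_env)  # TRUE
```

So the parameter is a binding in a new environment rather than text pasted into the body, and names not found there are looked up in the defining environment.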
joseph.g.boyer wrote:
Thomas, Jeff, Mark, Antonio,
Thank you for your answers. They have helped me clarify how R functions work. They work differently from SAS functions (which SAS calls macros.)
To me, while the coding is quite awkward, the execution is logical. The variable x has been defined by the call to the macro w, so there is no problem when SAS encounters a reference to x in the macro q. ...
But in the equivalent code in R,
q <- function(y) y +x; w <- function(x) q(x); w(2);
when R can't find the second argument of q in the local environment of the macro q, it doesn't look in the local environment of the macro w, it goes all the way back to the global environment, as you have all pointed out.
When you think of it as "all the way back to the global environment", you're introducing confusion. The lexical scoping way of doing it means that you can look at q right now and tell where it's going to look for x: first in q, then in the environment where you are defining it (global in this instance), etc. There's nothing dynamic about where it finds x. It does not matter how you call q or what w -- which might not even exist -- might or might not do.
The way you previously preferred depends entirely on how q is called. Imagine you have not just w, but w1, w2, w3, w4, ..., w77, and each one does something different -- many do not have x as parameter -- and several of them call each other before calling q. You cannot begin to tell ahead of time where x will come from, and it would be extremely hard to figure out the order of calls that actually occur to figure out which x you're going to get.
In your very simple example where q is always called by w and w never attempts to do anything tricky, it's not hard to see, but in the real world, it would make your head spin. You're still thinking in the dynamic sense when you talk about "all the way back".
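A small sketch of that point, with made-up names (make_q, q10, w1, w2): the x that q10 uses is fixed where q10 is created, so it does not matter which caller gets to it or what x that caller happens to hold.

```r
# make_q creates a function whose free variable x lives in make_q's call
# environment, i.e. where the inner function was defined.
make_q <- function(x) function(y) x + y
q10 <- make_q(10)

w1 <- function(x) q10(x)   # has its own x, which q10 ignores
w2 <- function(z) w1(z)    # has no x at all and goes through w1 first

r1 <- w1(2)
r2 <- w2(2)
print(c(r1, r2))  # 12 12
```

Both calls give the same answer because the lookup path for x can be read off the definition of q10 alone.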
Nordlund, Dan (DSHS/RDA) wrote:
Well, SAS macros are not functions in the traditional sense. The SAS macro language for the most part just does text substitution prior to the SAS code being sent to the SAS "compiler"/interpreter. So, your description of rewriting the "function body" in step 1. below, is fairly accurate for SAS macro, but it is not accurate for R. If you try to fit R functions into a SAS macro language mold you will only confuse yourself on both accounts. I will leave the technical details of R functions to the R experts. [....]
I bet one of the expeRts on the list will provide you with more detail than could have ever hoped for.
Not much, I think. It's one of those cases where you too easily end up rewriting manuals or even books. The text above is quite accurate: Macro-based languages substitute text, structured languages call functions with parameters. And some do a bit of each. And every now and again you wish that the language at hand would do the opposite of what it actually does.
One distinction is if you have things like
#define f(x) 2*x
#define g(y) f(y+2)
(in the C language preprocessor syntax), then you end up with g(y) as y+2*2 (i.e., y+4), whereas the corresponding function calls give 2*(y+2). Also, and the flip side of the original question: Macros have difficulties with encapsulation; with a bit of bad luck, arguments given to f() can modify its internal variables.
In R there are things that you want to do that are macro-like, and you can generally achieve the same effect with substitute/match.call/eval constructions, but it does get a bit contorted (lines 3-10 of the lm function are required reading if you want to understand these matters). Some of us occasionally ponder whether it would be cleaner to have a real (LISP-style) macro facility, but nothing really convincing has come up so far.
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
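As a rough illustration of the substitute/eval style of construction mentioned above (not the code in lm itself), here is a sketch with a hypothetical helper with_data that captures its argument unevaluated and then evaluates it with a data frame's columns in view. The names with_data, df and k are assumptions made for the example.

```r
# Hypothetical helper: evaluate an expression using columns of a data frame.
with_data <- function(data, expr) {
  caller <- parent.frame()   # where the helper was called from
  e <- substitute(expr)      # capture the expression itself, not its value
  eval(e, data, caller)      # look up names in data first, then in the caller
}

df <- data.frame(a = 1:3, b = c(10, 20, 30))
k <- 2
res <- with_data(df, a * k + b)
print(res)  # 12 24 36
```

Here a and b come from the data frame and k from the calling environment, which is the macro-like effect: the function decides where the caller's expression gets evaluated.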
Peter Dalgaard wrote:
One distinction is if you have things like
#define f(x) 2*x
#define g(y) f(y+2)
(in the C language preprocessor syntax), then you end up with g(y) as y+2*2 (i.e., y+4), whereas the corresponding function calls give 2*(y+2). Also, and the flip side of the original question: Macros have difficulties with encapsulation; with a bit of bad luck, arguments given to f() can modify its internal variables.
using c macros, you end up with g(y) substituted by 2*y+2, rather than y+2*2, as you say (and rather than 2*(y+2), which you'd effectively get using a function). that's why you'd typically include all occurrences of all macro 'parameters' in the macro 'body' in parentheses:
#define f(x) 2*(x)
some consider using c macros as not-so-good practice and favour inline functions. but macros are not always bad; in scheme, for example, you have a hygienic macro system which lets you use the benefits of macros while avoiding some of the risks.
vQ
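The same effect can be imitated in R by doing the text substitution by hand. This is only a toy sketch: expand_macro is a made-up helper that replaces every literal occurrence of the parameter name, which is enough for these one-letter examples but is not how the C preprocessor tokenizes.

```r
# Toy textual "macro expansion": paste the argument text in place of the
# parameter name, with no regard for operator precedence.
expand_macro <- function(body_text, param, arg_text) {
  gsub(param, arg_text, body_text, fixed = TRUE)
}

unsafe <- expand_macro("2*x", "x", "y+2")     # body without parentheses
safe   <- expand_macro("2*(x)", "x", "y+2")   # parenthesized parameter
print(unsafe)  # "2*y+2"
print(safe)    # "2*(y+2)"

# A real function receives the value of y + 2, so precedence is not an issue.
f <- function(x) 2 * x
y <- 3
results <- c(eval(parse(text = unsafe)), eval(parse(text = safe)), f(y + 2))
print(results)  # 8 10 10
```

With y = 3 the unparenthesized expansion gives 8, while the parenthesized expansion and the function call both give 10.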
Wacek Kusnierczyk wrote:
using c macros, you end up with g(y) substituted by 2*y+2, rather than y+2*2, as you say (and rather than 2*(y+2), which you'd effectively get using a function).
Oops. Yes. I suppose I had x*2 there at some point....
Peter Dalgaard