The stages of standard function evaluation

First of all, your message is a little hard to read because you posted 
in HTML.  This list removes the HTML, and often mangles messages, so you 
should always post in plain text.  But in this case your message was 
still pretty readable.

On 02/05/2018 11:04 PM, Andrew Hoerner wrote:

This may be mangling, but it's really hard to tell whether the 3 
paragraphs above are supposed to be steps, headings, or what.  Assuming 
they are steps, the first one is wrong.

The parser looks at a string and breaks it down into tokens and 
subexpressions, making what you later call an AST.  The first step in 
function evaluation is recognizing that something is a function call, 
not recognizing it as a function.  For example, "mean" is the name of a 
function and also an expression evaluating to a function, "mean(1:10)" 
is a function call.
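
To make the distinction concrete, here is a small sketch that uses quote() to look at the parsed forms.  The variable names e1 and e2 are made up for illustration.

```r
e1 <- quote(mean)        # a name (symbol), not a call
e2 <- quote(mean(1:10))  # a function call

is.name(e1)              # TRUE
is.call(e2)              # TRUE
is.function(eval(e1))    # TRUE: evaluating the name gives the function object
e2[[1]]                  # the expression used to specify the function: mean
eval(e2)                 # 5.5
```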

Once you have a function call, the next step looks at the expression 
used to specify the function.  In "mean(1:10)", that expression is 
"mean", but it could be an arbitrary R expression.  If it is a name like 
"mean" (or a string), then R looks for an object of mode "function" of 
that name in the current evaluation frame, or its parent frames.  These 
are not "constructed"; the current evaluation frame is always known, and 
contains a pointer to its parent.  If the function is specified by a 
more complex expression (e.g. in "fn[[1]](1:10)", the expression is 
"fn[[1]]") then that expression is evaluated.  It needs to return a 
function object or an error will be generated.

So these work:

mean(1:10)
list(mean)[[1]](1:10)
"mean"(1:10)

and these don't:

list("mean")(1:10)
c("mean")(1:10)
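
A minimal sketch of the lookup rule, and of what happens in the failing cases.  The function f and the local variable it calls "mean" are hypothetical, chosen only to show that lookup wants an object of mode "function".  The error text is deliberately not shown, since its wording may differ between R versions.

```r
f <- function() {
  mean <- 10         # a non-function object called "mean" in the evaluation frame
  mean(c(1, 2, 3))   # function lookup skips it and finds the real mean()
}
f()                  # 2

# In the failing cases the function expression evaluates to something
# that is not a function, so the call signals an error; capture it here.
res <- tryCatch(list("mean")(1:10), error = function(e) e)
inherits(res, "error")   # TRUE
```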

So now we have the function.  Its name is irrelevant.

Functions have at least 3 parts, not 2.  They have formals, a body, and 
an environment.  Nowadays they will often have bytecode as well; this is 
a compiled version of the body used in its place during evaluation.
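
The three parts can be inspected directly.  A small sketch, where f is just an example function and not anything from the original message:

```r
f <- function(x, y = 2) x + y

formals(f)          # pairlist: x (no default) and y (default 2)
body(f)             # x + y
environment(f)      # the environment in which f was defined

names(formals(f))   # "x" "y"
```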

It is only parsed once.

I have no idea what you are saying in this paragraph.  Positional versus 
named matching has no effect on scoping.  Arguments specified in the 
call are scoped in the calling frame; default values for arguments are 
scoped in the evaluation frame.
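
A sketch of the scoping difference, assuming a hypothetical function f whose default for y refers to a variable z that exists both in the caller and inside the function:

```r
z <- 1

f <- function(x, y = z * 2) {
  z <- 10   # local z in the evaluation frame
  x + y     # x and y are forced here
}

f(z)      # 21: x uses the caller's z (1); the default for y uses the local z (10)
f(z, z)   # 2: both arguments are supplied, so both use the caller's z
```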

You missed a step.  As evaluation starts, a new environment is created, 
the evaluation frame.  Its parent is the environment of the function; it 
is initialized with the formal arguments to the function as promises.
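
The evaluation frame can be captured with environment() from inside the function.  A minimal sketch (f, a and b are illustrative names):

```r
f <- function(x) {
  frame <- environment()   # the evaluation frame of this particular call
  list(frame = frame, parent = parent.env(frame))
}

a <- f(1)
b <- f(1)

identical(a$parent, environment(f))   # TRUE: parent is the function's environment
identical(a$frame, b$frame)           # FALSE: each call gets a new frame
"x" %in% ls(a$frame)                  # TRUE: the formal is bound in the frame
```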

This is true for both standard and non-standard functions.  All 
arguments are parsed, standard or not, producing promises.  They are 
placed in the evaluation frame, not "passed into the body".

No, parse first, match second, put into evaluation frame third.

No.  Each formal is bound to a promise in the evaluation frame. 
Promises contain an expression (an AST in your terms) and an 
environment.  As previously mentioned, the environment will be the 
calling frame for arguments passed in the call, the evaluation frame for 
arguments specified via defaults.

No.  Arguments are all treated as promises, i.e. un-evaluated 
expressions with an attached environment.  No search is done until later 
when they are evaluated.
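
A sketch of what "not evaluated until later" means in practice.  The functions f and g and the counter variable are hypothetical:

```r
f <- function(x) "x was never needed"
f(stop("this is not evaluated"))   # no error: the promise is never forced

count <- 0
g <- function(x) {
  x   # first use forces the promise
  x   # second use reuses the stored value
  invisible(NULL)
}
g(count <- count + 1)
count   # 1: the argument expression was evaluated once, in the calling frame
```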

No, promises contain expressions, and references (pointers) to environments.

That sounds correct.

No.  The body is just an expression.  Typically it's a compound 
statement enclosed in braces, but not necessarily.  No substitutions are 
done.  Later when it is evaluated, symbols in that expression will be 
looked up in the evaluation frame.
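
To illustrate that the body is an ordinary expression, here is a sketch that evaluates one by hand.  The environment env is only a stand-in for the evaluation frame; R does not literally call eval() like this.

```r
f <- function(x) x + 1       # body is a plain call
g <- function(x) { x + 1 }   # body is a call to `{`

class(body(f))   # "call"
class(body(g))   # "{"

# Evaluate the body expression with x looked up in an environment
# that we supply ourselves.
env <- new.env()
assign("x", 2, envir = env)
eval(body(f), env)   # 3
```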

This is unnecessarily complex.  Evaluation of the body expression is 
just like evaluation of any other expression.  What is special is that 
the evaluation frame is set as the current frame, and some of the 
objects in it are promises, which have their own special rules.

Again, unnecessary.

I would recommend separating observations like 1) from rules like 2). 
The rules are pretty simple.  The consequences of them can be more complex.

3) is just wrong.  Promises have environments where their expressions 
are evaluated.

Again, this is unnecessary.  The body is just an expression that is 
evaluated in the evaluation frame.

The basic difference between standard evaluation and nonstandard 
evaluation is whether the function looks at the expression in promises, 
or only looks at the value when it is evaluated.  substitute()  is the 
usual way to look at the expression, but packages like rlang define others.
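
A minimal sketch of the difference, using two made-up functions std and nse:

```r
std <- function(x) x               # only uses the value
nse <- function(x) substitute(x)   # looks at the expression in the promise

std(1 + 2)            # 3
nse(1 + 2)            # the unevaluated call 1 + 2
deparse(nse(a + b))   # "a + b", even though a and b don't exist
```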

Other issues that you haven't touched on that probably belong in a 
writeup like this are a description of how ... is handled, the rarely 
used ..1, ..2, etc., and the super-assignment operator <<-.
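
For readers who haven't met these three features, a short sketch of each.  All the function names here (f, second, make_counter, counter) are hypothetical.

```r
f <- function(...) {
  args <- list(...)   # collects all the arguments matched to ...
  length(args)
}
f(1, "b", 3)          # 3

second <- function(...) ..2   # ..2 refers to the second element of ...
second(1, "b", 3)             # "b"

make_counter <- function() {
  n <- 0
  function() {
    n <<- n + 1   # assigns to n in the enclosing environment, not a local n
    n
  }
}
counter <- make_counter()
counter()   # 1
counter()   # 2
```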

Duncan Murdoch