access to R parse tree for Lisp-style macros?

3 messages · Andrew Piskorski, Duncan Murdoch, Thomas Lumley

#
R folks, I'm curious about possible support for Lisp-style macros in
R.  I'm aware of the "defmacro" support for S-Plus and R discussed
here:

  http://www.biostat.wustl.edu/archives/html/s-news/2002-10/msg00064.html 

but that's really just a syntactic short-cut to the run-time use of
substitute() and eval(), which you could manually put into a function
yourself if you cared to.  (That is, not at all equivalent to Lisp
macros.)  The mlocal() function in mvbutils also seems to have similar
macro-using-eval properties:

  http://cran.r-project.org/src/contrib/Descriptions/mvbutils.html 
  http://www.maths.lth.se/help/R/.R/library/mvbutils/html/mlocal.html 
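
For concreteness, the substitute()/eval() pattern mentioned above can be written by hand roughly as follows. This is only a sketch: the function name `set_na` and its arguments are made up for illustration and are not taken from defmacro or mvbutils.

```r
# Hypothetical macro-like function: replace a sentinel value in one
# column of a data frame with NA, modifying the caller's variable.
set_na <- function(df, var, value) {
  # Build the unevaluated assignment; substitute() splices in the
  # expressions the caller supplied for df, var and value.
  code <- substitute(df$var[df$var == value] <- NA)
  # Evaluate the expanded code in the caller's frame.
  eval(code, parent.frame())
  invisible(NULL)
}

d <- data.frame(x = c(1, 999, 3))
set_na(d, x, 999)   # expands to: d$x[d$x == 999] <- NA
print(d$x)
```

All of the expansion happens at run time, on every call, which is the difference from Lisp macros that the paragraph above points out.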

I could of course pre-process R source code, either using a custom
script or something like M5:

  http://www.soe.ucsc.edu/~brucem/samples.html
  http://groups.google.com/group/comp.compilers/browse_thread/thread/8ece2f34620f7957/000475ab31140327

But that's not what I'm asking about here.  As I understand it,
Lisp-style macros manipulate the already-parsed syntax tree.  This
seems very uncommon in non-Lisp languages and environments, but some -
like Python - do have such support.  (I don't use Python, but I'm told
that its standard parser APIs are as powerful as Lisp macros, although
clunkier to use.)

Is implementing Lisp-style macros feasible in R?  Has anyone
investigated this or tried to do it?

What internal representation does R use for its parse tree, and how
could I go about manipulating it in some fashion, either at package
build time or at run time, in order to support true Lisp-style macros?

Whenever I try something like this in R:

  > dput(parse(text="1+2"))
  expression(1 + 2)

what I see looks exactly like R code - that '1 + 2' expression doesn't
look very "parsed" to me.  Is that really it, or is there some sort of
Scheme-like parse tree hiding underneath?  I see that the interactive
Read-Eval-Print loop basically calls R_Parse1() in "src/main/gram.c",
but from there I'm pretty much lost.

Also, what happens at package build time?  I know that R CMD INSTALL
generates binary *.rdb and *.rdx files for my package, but what do
those do exactly, and how do they relate to the REPL and R_Parse1()?

Finally, are there any docs describing the design and implementation
of the R internals?  Should I be looking anywhere other than the R
developer page?

  http://developer.r-project.org/

Thanks!
#
On 10/3/2005 3:25 AM, Andrew Piskorski wrote:

It is like a list of lists, with modes attached that say how they are to 
be interpreted.  parse() gives a list of mode "expression", containing a 
list of function calls or atomic objects.  Function calls are stored as 
a list whose head is the function name with subsequent entries being the 
arguments.

The mode may be "expression", or "call", or others, depending on what 
you are actually dealing with.

There's a parse tree underneath.  R is being helpful and deparsing it 
for you for display purposes.

To see it as a list, use "as.list" to strip off the mode, e.g.

 > as.list(parse(text="1+2"))
[[1]]
1 + 2

# A list containing one expression.  Expand it:

 > as.list(parse(text="1+2")[[1]])
[[1]]
`+`

[[2]]
[1] 1

[[3]]
[1] 2

# A function call to `+` with two arguments.  The arguments are atomic.
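
Because a call behaves like a list, its pieces can be replaced by index before it is evaluated. The following is a minimal sketch of that idea using only base R; the variable name `e` is arbitrary.

```r
e <- parse(text = "1 + 2")[[1]]   # a call object: `+`(1, 2)

# Replace the head of the call and one of its arguments.
e[[1]] <- as.name("*")
e[[3]] <- 10

print(e)         # deparsed for display: 1 * 10
print(eval(e))   # 10
```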

Use "mode" to work out how these are interpreted:

 > mode(parse(text="1+2"))
[1] "expression"
 > mode(parse(text="1+2")[[1]])
[1] "call"
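
Putting the pieces together, a whole tree can be rewritten by recursing over calls and treating everything else as a leaf. The sketch below uses a hypothetical helper, `swap_plus`, which is not part of R and exists only to illustrate the traversal; it does not handle calls with empty arguments such as `x[1, ]`.

```r
# Hypothetical helper: rewrite every call to `+` into a call to `-`.
swap_plus <- function(x) {
  if (!is.call(x)) return(x)               # symbols and constants are leaves
  parts <- lapply(as.list(x), swap_plus)   # recurse into head and arguments
  if (identical(parts[[1]], as.name("+"))) parts[[1]] <- as.name("-")
  as.call(parts)                           # rebuild the call from its parts
}

tree <- quote(a + (b + 2) * c)
new_tree <- swap_plus(tree)
print(new_tree)                                     # a - (b - 2) * c
print(eval(new_tree, list(a = 10, b = 5, c = 2)))   # 4
```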

The source code is sometimes the best place for low-level details like 
this.  The R Language manual sometimes gives low-level details, but it 
is uneven in its coverage; I forget if it covers this.

Duncan Murdoch
#
On Mon, 3 Oct 2005, Duncan Murdoch wrote:

Well, yes and no.  defmacro is a syntactic shortcut using functions, but 
what it does is manipulate and then evaluate pieces of parse tree.  It doesn't 
have the efficiency under compilation that real macros would, but we don't 
have compilation.  It doesn't have gensyms, but again, R fails to support 
these in a fairly fundamental way, so they have to be faked using 
variables with weird random names.
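
To illustrate the "weird random names" workaround, here is a rough sketch of a faked gensym used by a macro-like function. The names `fake_gensym` and `swap` are hypothetical and are not taken from defmacro. A random name only makes a clash unlikely; it does not rule one out the way a real gensym would.

```r
# Hypothetical helper: a symbol with a random suffix, unlikely to clash
# with any variable in the caller's frame.
fake_gensym <- function(prefix = "tmp") {
  suffix <- paste(sample(c(letters, 0:9), 12, replace = TRUE), collapse = "")
  as.name(paste0(".", prefix, "_", suffix))
}

# Macro-like swap of two variables in the caller's frame, using the
# faked gensym as the temporary.
swap <- function(a, b) {
  tmp <- fake_gensym()
  code <- bquote({
    .(tmp) <- .(substitute(a))
    .(substitute(a)) <- .(substitute(b))
    .(substitute(b)) <- .(tmp)
    rm(.(tmp))                  # clean up the temporary afterwards
  })
  eval(code, parent.frame())
  invisible(NULL)
}

x <- 1
y <- 2
swap(x, y)
print(c(x, y))
```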

I have a long-term plan to add real macros, but not until after Luke 
Tierney's byte-code compiler is finished.

 	-thomas