Parsing code with newlines
On Wed, Apr 10, 2019 at 5:06 AM, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
This is my first post here. I came across the very same problem. It can be reproduced with a modified tests/Embedding/RParseEval.c.
Please check https://www.r-project.org/posting-guide.html and update your post if you still need to get help here - from your current post I am not sure what you did, what was the error you got and from which tool, why you think the error was a result of something not working correctly/as documented, etc. The original post with the same subject you are probably referring to had the same problem.
The original post is linked via e-mail headers, but it goes back a decade. It shows up linked as a thread all right in Gnus, hence I thought it would be fine to jump straight to the matter. Here is the link to the original discussion:
https://stat.ethz.ch/pipermail/r-devel/2008-August/050332.html

At this point, I would like to report two bugs in the "Writing R Extensions" documentation. From that document it is not clear why carriage returns (0x0D) have to be removed from the input string to be parsed. Also, nowhere does that document mention R_ToplevelExec, which is needed if parsing is done in the outer context: that is, not when our C function is called from R, but when we are trying to parse R code in C directly, outside of the main loop. These are big show stoppers for newcomers.

The barely modified test code from my previous post does not parse what would seem a legitimate sample string, "\r\n ls()". However, it parses "\n ls()" all right. Nowhere in the docs is the intolerance to carriage returns mentioned. It is reproducible from the R console as well.

,----[ R console session ]
| > parse(text="\r\n ls()")
| Error in parse(text = "\r\n ls()") : <text>:1:1: unexpected input
| 1:
|     ^
| >
`----

Another problem with the aforementioned documentation concerns parsing erroneous expressions like "deadbeef<-function(,bad){}" in the top-level context. Instead of returning an error from parsing, it crashes (with R_suicide) unless the call is wrapped in R_ToplevelExec.
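Since "\n ls()" parses while "\r\n ls()" does not, the character the parser rejects appears to be the carriage return (0x0D). One possible workaround, sketched here under that assumption, is to normalize line endings before the string reaches the parser. The helper name normalize_newlines is hypothetical and not part of the R API; it is plain C with no R dependency.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the R API): copy src into dst, turning
 * "\r\n" pairs and lone '\r' characters into '\n'.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Returns the length of the normalized string. */
static size_t normalize_newlines(const char *src, char *dst)
{
    size_t n = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == '\r') {
            /* For "\r\n", drop the '\r'; the '\n' is copied on the
             * next iteration. A lone '\r' becomes '\n'. */
            if (src[i + 1] != '\n')
                dst[n++] = '\n';
        } else {
            dst[n++] = src[i];
        }
    }
    dst[n] = '\0';
    return n;
}
```

The normalized buffer would then be passed to mkString() in place of the raw input, before the R_ParseVector() call used in the test program in this thread.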
Please also note that "tests" (tests/Embedding/RParseEval.c) are not examples - if they do not catch R errors in some cases that is perfectly ok, they also may use internal API that is indeed not documented e.g. in Writing R Extensions.
Where would I find a good example of parsing in the top-level context, then? I have no problem with skipping error checks or with using undocumented functions. However, I would prefer to avoid major unexpected crashes. That example does NOT use any of the undocumented API and is therefore misleading. I believe it SHOULD include R_ToplevelExec, and that function SHOULD be in the docs.
Note Writing R Extensions has a section on embedding R and on cleanup handlers.
I have no problems with the rest of the document on embedding and clean up in general.
Actually this example has another issue, namely it doesn't wrap
everything in R_ToplevelExec. This is a major show stopper for
newcomers, as that function is barely mentioned anywhere, and a longjmp into
the terminated setuploop function followed by R_suicide looks like a mystery.
Error: bad value
Fatal error: unable to initialize the JIT
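To make the longjmp remark less mysterious, here is a toy model in plain C of why wrapping a call in a protected frame contains an error jump. It is only a sketch of the general setjmp/longjmp pattern: the names error_target, raise_error and protected_call are hypothetical, and nothing here is R's actual implementation or API.

```c
#include <setjmp.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not R's internals. */
static jmp_buf *error_target = NULL;   /* where a raised error jumps to */

/* Stand-in for an error raised deep inside a parser or evaluator. */
static void raise_error(void)
{
    longjmp(*error_target, 1);
}

/* Run fun(data) with a fresh jump target that lives in this stack frame.
 * Returns 1 if fun returned normally, 0 if it raised an error. */
static int protected_call(void (*fun)(void *), void *data)
{
    jmp_buf here;
    jmp_buf *saved = error_target;     /* restored on both exit paths */

    if (setjmp(here) == 0) {
        error_target = &here;
        fun(data);
        error_target = saved;
        return 1;
    }
    error_target = saved;              /* arrived here via longjmp */
    return 0;
}

/* Two sample callbacks: one succeeds, one raises an error. */
static void good_input(void *data) { *(int *)data = 42; }
static void bad_input(void *data)  { (void)data; raise_error(); }
```

In this model, calling raise_error() while error_target still points into a function that has already returned would be a jump into a dead stack frame, which is the kind of failure the wrapper is meant to rule out.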
That aside, here is the code with newlines that fails to parse. I hope
it will paste alright here.
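#include <stdio.h>
#include "embeddedRCall.h"
#include <R_ext/Parse.h>

int
main(int argc, char *argv[])
{
    SEXP e, tmp;
    int hadError;
    ParseStatus status;

    init_R(argc, argv);

    /* This string, which contains "\r", is the one that fails to parse. */
    PROTECT(tmp = mkString("\n\r ls()"));
    /* Parse at most one expression from the string. */
    PROTECT(e = R_ParseVector(tmp, 1, &status, R_NilValue));
    if (status != PARSE_OK)
    {
        printf("boo boo\n");
    }
    else
    {
        PrintValue(e);
        /* Evaluate the first parsed expression in the global environment. */
        R_tryEval(VECTOR_ELT(e,0), R_GlobalEnv, &hadError);
    }
    UNPROTECT(2);
    end_R();
    return(0);
}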
-- Mikhail