R_parseVector and syntax error [was: error messages while parsing with rniParse]

7 messages · Simon Urbanek, Duncan Murdoch, Romain Francois

#
Hello,

[I'm redirecting this here from stats-rosuda-devel]

When parsing R code through R_ParseVector and the code generates an 
error (syntax error), is there a way to grab the error message?
It looks like yyerror populates the buffer "R_ParseErrorMsg", but that 
variable is not part of the public API.

Would it be possible to add yet another entry point to the parser that 
would basically wrap R_parseVector so that it would have an extra char* 
argument that would bring back the error message if there is an error?
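
A minimal sketch of the shape of the wrapper being asked for, under loud assumptions: toy_parse, toy_status and the messages are hypothetical names made up for illustration, and the "parser" is a stand-in that only checks that parentheses balance. It is not R's parser and not a proposed R API; it only shows an entry point that hands the error message back through a caller-supplied char* buffer rather than keeping it in a private static one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes, loosely modelled on the idea of a parse status. */
typedef enum { TOY_PARSE_OK, TOY_PARSE_ERROR } toy_status;

/*
 * Hypothetical stand-in for a parser entry point. It only checks that
 * parentheses balance. On failure it writes a message into errbuf
 * (NUL-terminated, truncated to errlen bytes) so the caller can relay it.
 */
toy_status toy_parse(const char *src, char *errbuf, size_t errlen)
{
    assert(src != NULL && errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';               /* empty message means "no error" */

    int depth = 0;                  /* current nesting of '(' */
    int line = 1;                   /* 1-based line counter */
    for (const char *p = src; *p != '\0'; p++) {
        if (*p == '\n') {
            line++;
        } else if (*p == '(') {
            depth++;
        } else if (*p == ')' && --depth < 0) {
            snprintf(errbuf, errlen, "unexpected ')' at line %d", line);
            return TOY_PARSE_ERROR;
        }
    }
    if (depth > 0) {
        snprintf(errbuf, errlen, "unexpected end of input at line %d", line);
        return TOY_PARSE_ERROR;
    }
    return TOY_PARSE_OK;
}
```

A front end would call toy_parse(code, buf, sizeof buf) and, when the result is TOY_PARSE_ERROR, show buf to the user.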

Romain
#
Romain Francois wrote:
I would oppose that.  Suggest ways to reduce the complexity of the 
parser interface and I'd be interested.  It's a nightmare to make any 
changes there.

You can always call the R function wrapped in try(), so it's not as 
though this would give you anything that you don't already have access to. 

Duncan Murdoch
#
On Jun 18, 2009, at 17:02 , Duncan Murdoch wrote:

            
I'm not quite following - we're talking about R_ParseVector in C code  
so the point is that the C code gets access to the error message so it  
can relay it to the user. There are no R-level functions involved  
here. The issue here for the moment is that this information is  
retrievable at R level but not (officially) at the C level. As for  
reducing complexity - technically, there is no complexity added since  
all this is already in place ... [adding extra char * argument to  
ParseVector may not be the best way but that's not what I'm arguing  
for]. Or am I missing something?

Cheers,
S
#
Simon Urbanek wrote:
I understood that.  But the C code can get the error message by 
evaluating an R expression and looking at the result.
I wouldn't mind exposing the underlying information in a clean way, but 
the string in R_ParseErrorMsg isn't all a front end should get. 

At the time of an R_ParseVector syntax error, the parser knows what 
token it couldn't handle, and it knows its classification, and the 
location in the file where it came from.   Not all of that makes it 
through to the error message.
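
To make the point about token, classification and location concrete, here is a rough sketch of a structured error record in C. Everything in it (toy_parse_error, toy_token_class, toy_format_error, the field names and the message layout) is hypothetical and invented for illustration; it does not mirror the variables the R parser actually keeps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token classes; a real parser would use its own set. */
typedef enum { TOK_SYMBOL, TOK_NUMBER, TOK_OPERATOR, TOK_END } toy_token_class;

/* Hypothetical record of what a parser knows at the point of failure. */
typedef struct {
    const char *token;        /* text of the token that could not be handled */
    toy_token_class klass;    /* its classification */
    int line;                 /* 1-based line in the input */
    int column;               /* 1-based column in the input */
} toy_parse_error;

static const char *toy_class_name(toy_token_class k)
{
    switch (k) {
    case TOK_SYMBOL:   return "symbol";
    case TOK_NUMBER:   return "number";
    case TOK_OPERATOR: return "operator";
    case TOK_END:      return "end of input";
    }
    return "token";
}

/*
 * Render the record as a single line. Returns what snprintf returns:
 * the length the full message needs, excluding the terminating NUL.
 */
int toy_format_error(const toy_parse_error *e, char *buf, size_t len)
{
    assert(e != NULL && buf != NULL);
    return snprintf(buf, len, "%d:%d: unexpected %s '%s'",
                    e->line, e->column, toy_class_name(e->klass), e->token);
}
```

With a record like this, the plain string becomes just one possible rendering, and a front end can use the line and column directly, for example to highlight the offending token.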
Adding an extra char * argument to R_ParseVector was what I was arguing against.

Duncan Murdoch
#
Duncan Murdoch wrote:
Great. Let's do that.
Is a function that simply returns some of the static variables used by 
bison clean enough?

  
    
#
Romain Francois wrote:
It could be.   I'd like a design that allows for the possibility of 
multiple syntax errors to be reported.  I have parse_Rd doing that, 
though not committed yet.  parse() is different because we have to be 
less tolerant of errors in R code than in Rd files.  But we could still 
report multiple errors in one parse, not just stop at the first one.
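
As a rough illustration of reporting several errors from one pass, here is a toy sketch in C. It is not how parse_Rd or parse() work: the "grammar" (each line must have balanced parentheses), the resynchronisation rule (skip to the end of the line) and all names (toy_error_list, toy_parse_all, TOY_MAX_ERRORS) are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ERRORS 8   /* hypothetical cap; later errors are dropped */

typedef struct {
    int line;      /* 1-based line of the error */
    int column;    /* 1-based column where it was detected */
} toy_error_pos;

typedef struct {
    toy_error_pos items[TOY_MAX_ERRORS];
    size_t count;
} toy_error_list;

static void toy_record(toy_error_list *errs, int line, int column)
{
    if (errs->count < TOY_MAX_ERRORS) {
        errs->items[errs->count].line = line;
        errs->items[errs->count].column = column;
        errs->count++;
    }
}

/*
 * Toy grammar: every line must have balanced parentheses.
 * After an error the scanner resynchronises at the next newline instead
 * of aborting. Returns the number of errors recorded (at most
 * TOY_MAX_ERRORS).
 */
size_t toy_parse_all(const char *src, toy_error_list *errs)
{
    assert(src != NULL && errs != NULL);
    errs->count = 0;

    int line = 1, column = 0, depth = 0;
    for (const char *p = src; ; p++) {
        if (*p == '\n' || *p == '\0') {
            if (depth > 0)
                toy_record(errs, line, column);   /* '(' left unclosed */
            if (*p == '\0')
                break;
            line++;
            column = 0;
            depth = 0;
            continue;
        }
        column++;
        if (*p == '(') {
            depth++;
        } else if (*p == ')' && --depth < 0) {
            toy_record(errs, line, column);       /* stray ')' */
            while (p[1] != '\n' && p[1] != '\0')  /* skip rest of the line */
                p++;
            depth = 0;
        }
    }
    return errs->count;
}
```

The interesting design question is the resynchronisation point: a newline is easy for a toy, but for R code a bad choice can produce a cascade of spurious follow-on errors, which is one reason to be less tolerant there than in Rd files.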

Duncan Murdoch
#
Duncan Murdoch wrote:
This is an interesting problem. Just being curious here: how do you 
continue parsing after a syntax error in parse? Does it depend on the 
kind of syntax error? Do you use some of the recovery protocols of 
bison? At the moment the special "error" token only appears in the very 
top-level prog symbol:

prog    :   END_OF_INPUT            { return 0; }
        |   '\n'                    { return xxvalue(NULL,2,NULL); }
        |   expr_or_assign '\n'     { return xxvalue($1,3,&@1); }
        |   expr_or_assign ';'      { return xxvalue($1,4,&@1); }
        |   error                   { YYABORT; }
        ;


Anyway, what about using the extra information to structure the error 
message as a custom condition class?