parse/INSTALL feature request(s)

5 messages · Gabor Grothendieck, mwtoews at sfu.ca, Duncan Murdoch

#
Hi,
I have a feature request for 'parse', and possibly the 'R CMD INSTALL' 
command to display more informative error information. Specifically, 
after making several modifications to a package on my system (the 
package name is irrelevant; I'm using R 2.3.1 on Debian):
$ R CMD INSTALL seas
* Installing *source* package 'seas' ...
** R
** data
** inst
** preparing package for lazy loading
Error in parse(file, n, text, prompt) : syntax error at
1222:       ylim <-
1223:     else
Execution halted
ERROR: lazy loading failed for package 'seas'
** Removing '/usr/local/lib/R/site-library/seas'

Of the 24 or so *.R source files in the edited package, I'm not sure 
where I had made the syntax error, but it would be very nice to see an 
error message that says something more helpful, such as "Error in 
parse(file, n, text, prompt) : syntax error in function 'foo' at ...". 
Is it possible for 'parse' to display the parent function name in which 
the error occurred? As far as 'R CMD INSTALL' goes, I realize that all the 
*.R files are cat'ed together, then parsed as one file, so it is 
difficult to report a file name and line number(s). Is this concatenating 
necessary? Or couldn't the individual files be parsed (and errors/line 
#'s reported for offending files), then combined in an environment for 
the INSTALL'ed package?

In the meantime, my solution to find the offending file is:
for(i in dir(pattern="\\.R$")){
    print(i)
    parse(i)
}
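
A variant of the loop above that keeps going after the first failure and records the parser's message next to each file name could look like the following. It is only a sketch: the helper name `find_parse_errors` is made up for illustration, and it assumes the package sources live in a directory called `R`.

```r
# Hypothetical helper: parse every *.R file under `dir` and collect failures.
find_parse_errors <- function(dir = "R") {
    files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
    errors <- list()
    for (f in files) {
        # tryCatch yields NULL when the file parses, the error message otherwise
        msg <- tryCatch({
            parse(file = f)
            NULL
        }, error = function(e) conditionMessage(e))
        if (!is.null(msg)) errors[[f]] <- msg
    }
    errors  # named list: file path -> parse error message
}
```

Calling `find_parse_errors("seas/R")` from the directory above the package would then return an empty list if every file parses on its own, or one entry per offending file.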

Thanks,
+mt
#
On 10/20/2006 8:32 PM, Michael Toews wrote:
Yes, that would be nice, but I don't think it's possible:  if the parse 
fails, R won't know enough to know that it was in the middle of parsing 
function 'foo'.

What would be possible is to do like other preprocessors do, and put 
comments into the source file to indicate the origin of each line:  then 
the parser could tell you the location in the original file, which would 
be a lot more useful.  This would slow things down a bit and make the 
package bigger so we'd probably want it to be optional, but it would 
help a lot in situations like yours.
I don't know the reason for putting them in one file.
Your loop looks like a good idea; if it doesn't catch the error, then it's 
probably something wrong with the last line in one of the files, e.g. 
not having a newline at the end of it.
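
If each file parses cleanly on its own, a missing final newline is worth checking, since with plain concatenation the last line of one file is then joined to the first line of the next. A rough check, assuming nothing about the package layout beyond a list of file paths (the helper name `lacks_final_newline` is invented for this sketch):

```r
# Hypothetical helper: TRUE if the file is non-empty and does not end in "\n".
lacks_final_newline <- function(path) {
    size <- file.info(path)$size
    if (is.na(size) || size == 0) return(FALSE)
    # Read the file as raw bytes and inspect only the last one
    bytes <- readBin(path, what = "raw", n = size)
    bytes[length(bytes)] != as.raw(10L)
}
```

For example, `Filter(lacks_final_newline, list.files("R", pattern = "\\.[Rr]$", full.names = TRUE))` would list the files to look at.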

Duncan Murdoch
#
On 10/20/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
I try to develop my functions one at a time for this very reason.  Of course
it's not always feasible, but if you can do it, following each change with an
install makes it easier to locate problems.
#
I think what you are referring to is already nicely implemented (in 
R/src/main/source.c lines 89 to 97), and the errors are correctly shown 
in my example as lines 1222 to 1223. However, these line numbers are 
irrelevant to the original files, as they have been concatenated into one 
large file (I think on line 765 of R/src/scripts/INSTALL.in ). This 
single file is loaded into an R environment (on line 790 of INSTALL.in), 
which is also the region where my original error from 'R CMD INSTALL' 
was thrown. The loading is done by the 'tools:::makeLazyLoading' 
function, which loads in the single R-source file into a fresh 
environment for the package.

I'm unsure of why a single R source file is needed in INSTALL (which 
limits the ability to show any offending file names and their line 
numbers). The functions in R/src/library/tools/R/makeLazyLoad.R could 
perhaps be made a bit more versatile to individually read the R source 
files for a package using a 'pattern="\\.R$"', and throw an error from 
'sys.source' displaying the file name, and other error messages from 
'parse'. This is simply a feature-request suggestion, but as Gabor 
advised, I should really develop my functions file-by-file.
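
To make the suggestion concrete, here is a rough sketch of per-file loading. It is not how 'tools:::makeLazyLoading' works; the helper name `load_sources`, the `R` directory default, and the alphabetical file order are all assumptions made for illustration.

```r
# Hypothetical helper: source each file into one environment,
# and name the offending file if parsing or evaluation fails.
load_sources <- function(dir = "R", envir = new.env(parent = globalenv())) {
    files <- sort(list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE))
    for (f in files) {
        tryCatch(
            sys.source(f, envir = envir),
            error = function(e)
                stop("in file '", f, "': ", conditionMessage(e), call. = FALSE)
        )
    }
    envir  # all objects defined by the files end up here
}
```

Because each file is parsed separately, any line numbers in the parser's message refer to that file rather than to a concatenated one.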
+mt
#
On 10/21/2006 6:36 PM, Michael Toews wrote:
Thank you.
No, that's not what I was referring to.  Preprocessors put lines like

#line 3 "header.h"

into their output so that the compiler can put useful debugging 
information into files when it processes them.  R could do that when it 
concatenates the files, and then tell you the origin of each line, not 
the relatively useless line number from the concatenated file.  (A very 
simple version would add a record like that on every line, and that 
would work with the current error reporting system; a slightly more 
sophisticated version would just put those lines in between each 
concatenated file, and then the error reporting would need to be made 
aware of them.)
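
The "slightly more sophisticated version" can be illustrated outside the parser with two small functions: one writes a marker between concatenated files, the other maps a line number in the concatenated text back to its origin. This is a sketch only. The names `concat_with_markers` and `locate_line` are invented here, the marker format simply mimics preprocessor output, and a source line that itself looks like a marker would confuse the lookup.

```r
# Hypothetical helper: concatenate files, preceding each with a marker line.
concat_with_markers <- function(files) {
    out <- character(0)
    for (f in files) {
        out <- c(out, sprintf('#line 1 "%s"', f), readLines(f))
    }
    out
}

# Hypothetical helper: map line `n` of the concatenated text to file and line.
locate_line <- function(lines, n) {
    markers <- grep('^#line [0-9]+ ".*"$', lines)
    prior <- markers[markers < n]
    if (length(prior) == 0) stop("no #line marker before line ", n)
    m <- prior[length(prior)]  # nearest marker above line n
    start <- as.integer(sub('^#line ([0-9]+) .*$', "\\1", lines[m]))
    file <- sub('^#line [0-9]+ "(.*)"$', "\\1", lines[m])
    # The line right after the marker is line `start` of the original file
    list(file = file, line = start + (n - m - 1L))
}
```

An error reported at some line of the concatenated text could then be passed through `locate_line()` to recover the original file name and line number.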

I'm reasonably hopeful that version 2.5.0 will have more source level 
debugging support in it, at least to the level of the "slightly more 
sophisticated version", and maybe better than that.

If you'd like to help with this, you can see the (currently extremely 
unstable and incomplete) code on the djm-source branch in the 
repository.  No support for "#line" yet.

Duncan Murdoch