
Changes to parser in R-devel

5 messages · Duncan Murdoch, Yihui Xie

#
I have just committed (in r59883) some changes to the R parser based on 
Romain Francois' parser package.  Packages that made use of parser 
should find that the information in base R gives them what they need to 
work with, but the data is not identical to what parser recorded (since 
that was not consistent with some things already in R).  One reason for 
the change was that the parser in the parser package was slightly 
different from the one in R; the hope is that providing these services 
in R will make maintenance easier for things like code analysis, pretty 
printing, etc.

See ?getParseData for details, and if you are maintaining a package that 
depends on parser, feel free to ask me for help in the transition, or 
make suggestions for changes if I've done something that causes you too 
much trouble.
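
For readers who have not used the new facility, here is a minimal sketch of a call to getParseData(). The sample expression and the variable names are illustrative assumptions, not part of the commit. keep.source = TRUE is passed explicitly because non-interactive sessions default to keep.source = FALSE, in which case no parse data is recorded.

```r
# Parse a small expression and keep its source references, so that
# the parser records the token table alongside the expression.
exprs <- parse(text = "x <- f(1, 2)", keep.source = TRUE)

# One row per token or expression node: positions, id, parent, token type.
pd <- utils::getParseData(exprs)

# Terminal rows are the tokens that appear literally in the source text.
tokens <- pd[pd$terminal, c("line1", "col1", "token", "text")]
print(tokens)
```

The parent column links each row to the id of its enclosing expr node, which is what code-analysis and pretty-printing tools would walk.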

Duncan Murdoch

P.S. to Qiang Li:  as mentioned privately, the goal for this change was 
to reproduce output equivalent to what parser did, so I have not 
incorporated your suggested change to outlaw expressions like "x[[1] ]"  
(with an embedded space where it shouldn't be).  After things settle 
down we can consider that change and others.
1 day later
#
I'm not sure if there is a bug somewhere; see this example:

getParseData(parse(text='function(x){}'))

  line1 col1 line2 col2 id parent          token terminal     text
1     1    1     1    8  1     11       FUNCTION     TRUE function
2     1    9     1    9  2     11            '('     TRUE        (
3     1   10     1   10  3      5 SYMBOL_FORMALS     TRUE        x
4     1   11     1   11  4     11            ')'     TRUE        )
5     1   12     1   12  6      8            '{'     TRUE        {
6     1   13     1   13  7      8            '}'     TRUE        }
7     1   12     1   12  5     11            '}'     TRUE        {
8     1   12     1   13  8     11           expr    FALSE
9     1    1     1   13 11      0           expr    FALSE

I get an additional { in the 7th row of the 'text' column.
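
A rough way to spot a row like that automatically is to compare each terminal's text with the source substring at its recorded columns, and to look for two terminals claiming the same position. This is only a sketch that assumes single-line ASCII source; the names recovered, mismatch and dup are made up for illustration.

```r
src <- "function(x){}"
pd <- utils::getParseData(parse(text = src, keep.source = TRUE))
term <- pd[pd$terminal, ]

# Each terminal's text should equal the source characters between
# its recorded start and end columns (single-line source assumed).
recovered <- substring(src, term$col1, term$col2)
mismatch <- term[term$text != recovered, ]

# No two terminals should occupy exactly the same source range.
dup <- term[duplicated(term[, c("line1", "col1", "line2", "col2")]), ]

print(mismatch)
print(dup)
```

On a build without the problem both data frames should be empty; in the output above, rows 5 and 7 share line 1, column 12, so the duplicate check would flag one of them.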

Another problem is that for the empty function below, there is a
noticeable pause if you run it more than once:

getParseData(parse(text='function(){}'))

and you may get wild line/col numbers like this:

   line1 col1     line2 col2 id parent    token terminal     text
1      1    1         1    8  1      9 FUNCTION     TRUE function
2      1    9         1    9  2      9      '('     TRUE        (
3      1   10         1   10  3      9      ')'     TRUE        )
4      1   11         1   11  4      6      '{'     TRUE        {
5      1   12         1   12  5      6      '}'     TRUE        }
6 320024   11 140106360   11 11      9      '}'     TRUE
7      1   11         1   12  6      9     expr    FALSE
8      1    1         1   12  9     11     expr    FALSE
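
Values like those in row 6 can be caught with a simple bounds check against the source text. The following is a sketch under the assumption that the source is available as a character string; n_lines, max_col and bad_rows are illustrative names.

```r
src <- "function(){}"
pd <- utils::getParseData(parse(text = src, keep.source = TRUE))

# Work out how many lines the source has and how wide the widest one is.
src_lines <- strsplit(src, "\n", fixed = TRUE)[[1]]
n_lines <- length(src_lines)
max_col <- max(nchar(src_lines))

# Every recorded position should fall inside the source text.
in_bounds <- pd$line1 >= 1 & pd$line1 <= pd$line2 & pd$line2 <= n_lines &
  pd$col1 >= 1 & pd$col2 <= max_col
bad_rows <- pd[!in_bounds, ]
print(bad_rows)
```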

Worse, it can crash R:

 *** caught segfault ***
address 0x9488c20, cause 'memory not mapped'

Traceback:
 1: parse(text = "function(){}")
 2: getSrcref(x)
 3: getSrcfile(x)
 4: getParseData(parse(text = "function(){}"))

R Under development (unstable) (2012-07-18 r59904)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Wed, Jul 18, 2012 at 2:31 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
#
On 12-07-19 4:41 PM, Yihui Xie wrote:
There's definitely a bug in the handling of empty lists, such as the 
empty list of commands in your first example and the empty list of 
arguments in your second.  There's a partial workaround currently in 
R-devel, but not a perfect fix.  (This is due to me missing a conversion 
from Romain's 0-based column counting to the usual 1-based counting.)
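
To illustrate the convention in question, the sketch below checks that the columns reported by getParseData() are 1-based and shows the shift that a consumer expecting 0-based columns would need. The zero_based variable is hypothetical and only demonstrates the arithmetic; it is not how the fix itself is written.

```r
src <- "function(){}"
pd <- utils::getParseData(parse(text = src, keep.source = TRUE))

# The keyword "function" is the first token in the source.
first <- pd[pd$token == "FUNCTION", ]

# With 1-based counting it spans columns 1 through 8.
one_based <- c(first$col1, first$col2)

# A hypothetical 0-based consumer would see the same span as 0 through 7.
zero_based <- one_based - 1L

print(one_based)
print(zero_based)
```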

I expect it will be fixed tomorrow, or sooner.

Duncan Murdoch
#
On 19/07/2012 6:50 PM, Duncan Murdoch wrote:
As far as I know, it is now fixed (in r59913).

Duncan Murdoch
#
Great. I just tested it and did not find any more problems. Thanks!

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Fri, Jul 20, 2012 at 12:22 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote: