
grep problem in R-devel 2.14 r57004

3 messages · Mark Bravington, Simon Urbanek

#
Problem below with PCRE grep in R-devel; works fine in R-patched. (Unless there's been an absolutely massive change in the rules for the updated PCRE version 8.13; jeez, I hope not.)
> grep("[.][.]", "", perl = TRUE)
Error in grep("[.][.]", "", perl = TRUE) :
  invalid regular expression '[.][.]'
In addition: Warning message:
In grep("[.][.]", "", perl = TRUE) : PCRE pattern compilation error
        'POSIX collating elements are not supported'
        at '[.][.]'
R Under development (unstable) (2011-09-13 r57004)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
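
For comparison, here is a minimal sketch of the behaviour the report expects, assuming an R build whose PCRE does not have the 8.13 problem (R-patched at the time, per the message above). The test strings and variable names are made up for illustration and are not from the original report.

```r
# Sketch only: assumes a PCRE build without the 8.13 regression.
x <- c("a..b", "a.b", "ab")   # made-up test strings, not from the report

# A lone "." inside brackets is a literal dot, so this looks for two dots in a row.
two_dots_class <- grepl("[.][.]", x, perl = TRUE)

# Escaping the dot is an equivalent spelling.
two_dots_escaped <- grepl("\\.\\.", x, perl = TRUE)

# fixed = TRUE skips regex syntax altogether.
two_dots_fixed <- grepl("..", x, fixed = TRUE)

print(rbind(two_dots_class, two_dots_escaped, two_dots_fixed))
```

All three spellings should agree with each other; the fixed = TRUE form is a possible workaround on an affected build because it never reaches the regex compiler.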

NB I'm sending to R-devel rather than posting a bug report because (i) I have a dim recollection that's what we're supposed to do for bugs in R-devel, (ii) Bugzilla doesn't include an R-devel version, and (iii) I couldn't find any guidance on these matters.

Mark
#
Mark, quick googling gives the answer - [.] is not what you think it is, you probably meant [\.]. Bracket expressions starting with [. are collating symbols, which are unsupported by PCRE (only [:xxx:] is supported; neither [=xxx=] nor [.xxx.] is), but that's probably not what you intended. See POSIX:

9.3.5 RE Bracket Expression
[...]
1. [..] The character sequences "[.", "[=", and "[:" (left-bracket followed by a period, equals-sign, or colon) shall be special inside a bracket expression and are used to delimit collating symbols, equivalence class expressions, and character class expressions.

Cheers,
Simon
On Sep 16, 2011, at 12:45 AM, <Mark.Bravington at csiro.au> <Mark.Bravington at csiro.au> wrote:

            
#
I forgot to mention the more obvious ;) - yes, it is a known issue in PCRE 8.13 which is hitting more people.
After re-reading the standard, I think the problem was that PCRE 8.13 treated [. as special even without an enclosing [. This has since been addressed in the PCRE trunk, which also has a comment on what happened. I have ported that fix into R-devel.
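
The distinction can be sketched roughly as follows, assuming a build that includes the fix. The helper compiles() is hypothetical (it is not part of R or PCRE); it only reports whether grepl() accepts a pattern with perl = TRUE.

```r
# Hypothetical helper, not part of R: TRUE if the pattern compiles under PCRE.
compiles <- function(pattern) {
  tryCatch({
    suppressWarnings(grepl(pattern, "", perl = TRUE))
    TRUE
  }, error = function(e) FALSE)
}

# Here "[" opens the bracket expression and "." is its only member,
# so there is no enclosing "[" and no collating symbol.
compiles("[.][.]")

# Here "[." appears inside an enclosing "[", which is the collating-symbol
# syntax that PCRE is expected to reject as unsupported.
compiles("[[.a.]]")
```

On a fixed build the first call should return TRUE; the second is expected to return FALSE for as long as PCRE leaves collating symbols unsupported.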

Cheers,
Simon
On Sep 16, 2011, at 9:01 AM, Simon Urbanek wrote: