
gsub() hex character range problems in R-devel?

I'm not very good with character encodings, so this might be user error. The following code is meant to replace extended ASCII characters, in particular a non-breaking space, with "". It works in R-4-1-branch:
[1] "R version 4.1.2 Patched (2022-01-04 r81445)"
> gsub("[\177-\xff]", "", "fo\xa0o")
[1] "foo"

but fails in R-devel
[1] "R Under development (unstable) (2022-01-04 r81445)"
Error in gsub("[\177-\xff]", "", "fo\xa0o") : invalid regular expression '[-?]', reason 'Invalid character range'
In addition: Warning message:
In gsub("[\177-\xff]", "", "fo\xa0o") :
  TRE pattern compilation error 'Invalid character range'
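
For comparison, here is a minimal sketch of the same intent (dropping every byte from \177 to \xff) that avoids the regex engine by filtering raw bytes. The helper name strip_high_bytes is hypothetical and not part of R or of the code above; the sketch assumes non-NA input and treats each string purely as a sequence of bytes. The validUTF8() call is included only to check whether the test input is well-formed in a UTF-8 locale, which may be relevant to the range error.

```r
# Hypothetical helper (an assumption for illustration, not part of base R):
# drop every byte in the range 0x7f-0xff by filtering the raw bytes,
# so neither the regex engine nor the locale is involved.
strip_high_bytes <- function(x) {
  vapply(x, function(s) {
    bytes <- charToRaw(s)                        # string -> raw byte vector
    rawToChar(bytes[as.integer(bytes) < 0x7f])   # keep bytes below \177 only
  }, character(1), USE.NAMES = FALSE)
}

x <- "fo\xa0o"        # same input as in the failing gsub() call
validUTF8(x)          # is the input well-formed UTF-8?
strip_high_bytes(x)
```

This only sidesteps the question rather than answering it. Another option that may be worth trying is gsub()'s useBytes = TRUE argument, which asks for byte-wise matching; whether it behaves the same way in R-4-1-branch and R-devel is not something this sketch establishes.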

There are other oddities, too, like
[1] "\xfc\xbe\x8c\x86\x84\xbc"
[1] "<>"

The R-devel sessionInfo is
R Under development (unstable) (2022-01-04 r81445)
Platform: x86_64-apple-darwin19.6.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Users/ma38727/bin/R-devel/lib/libRblas.dylib
LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.2.0

(I have built my own R on macOS; similar behavior is observed on a Linux machine)

Any hints welcome,

Martin Morgan