Inconsistency in gsub in R.2.6.2 (PR#10978)

1 message · christian.buchta at wu-wien.ac.at

Hi,

Might this be an oversight?

R version 2.6.2 Patched (2008-03-13 r44783)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

...

 > x <- "ab?"
 > Encoding(x)
[1] "latin1"
 > Encoding(gsub("?","", x))
[1] "unknown"
 > Encoding(gsub("?","", x, perl = TRUE))
[1] "latin1"

The code in src/main/pcre.c (see also do_tolower and do_strsplit in
src/main/character.c) suggests patching gsub as attached. With the patch
applied:

 > x <- "ab?"
 > Encoding(gsub("?","", x))
[1] "latin1"
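
The non-ASCII character in the transcripts above is displayed as "?" in this archive, so a minimal sketch of the same check with an explicit escape may be useful. It assumes "\xe4" (a-umlaut in Latin-1) as a stand-in, which is not necessarily the original character, and it removes "b" rather than the non-ASCII character so that the result is not pure ASCII. What it reports will depend on the R version and locale.

```r
# Sketch only: "\xe4" is an assumed stand-in for the character shown as "?" above.
x <- "ab\xe4"
Encoding(x) <- "latin1"   # declare the string as Latin-1 explicitly

Encoding(x)                               # declared encoding of the input
Encoding(gsub("b", "", x))                # default regex engine
Encoding(gsub("b", "", x, perl = TRUE))   # PCRE engine
```

If the last two calls report different encodings, the inconsistency described here is present in that build.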


Happy Easter

Christian