Error: invalid multibyte string
On 10/28/06, Henrik Bengtsson <hb at stat.berkeley.edu> wrote:
On 10/28/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
On Fri, 27 Oct 2006, Henrik Bengtsson wrote:
In Section "Package subdirectories" in "Writing R Extensions" [2.4.0
(2006-10-10)] it says:
"Only ASCII characters (and the control characters tab, formfeed, LF
and CR) should be used in code files. Other characters are accepted in
comments, but then the comments may not be readable in e.g. a UTF-8
locale. Non-ASCII characters in object names will normally [1] fail
when the package is installed. Any byte will be allowed [2] in a
quoted character string (but \uxxxx escapes should not be used), but
non-ASCII character strings may not be usable in some locales and may
display incorrectly in others.", where the footnote [2] reads "It is
good practice to encode them as octal or hex escape sequences".
(Note: ASCII refers (correctly) to the 7-bit ASCII [0-127] and none of
the 8-bit ASCII extensions [128-255].)
According to the sentence about quoted strings, the following R/*.R code
should still be valid:
pads <- sapply(0:64, FUN=function(x) paste(rep("\xFF", x), collapse=""));
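As a minimal sketch (not part of the original exchange), the bytes behind such escapes can be inspected with charToRaw() and nchar(type = "bytes"), which look at the raw bytes rather than at how the locale would display them. The variable names are illustrative only.

```r
# "\xFF" is a one-byte string whose single byte is 0xFF (outside 7-bit ASCII).
one <- "\xFF"
one_raw <- charToRaw(one)
one_len <- nchar(one, type = "bytes")

# Same construction as in the message: 0 to 64 copies of that byte.
pads <- sapply(0:64, FUN = function(x) paste(rep("\xFF", x), collapse = ""))
pad_lens <- nchar(pads, type = "bytes")
```

Counting bytes instead of characters avoids asking R to interpret the string in the current encoding, which is the step that can fail in a UTF-8 locale.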
That looks like it should be valid (at least according to the documentation), even though it won't run usefully in UTF-8 locales. What you wrote before was:
On Thu, 26 Oct 2006, Henrik Bengtsson wrote:
I'm observing the following on different platforms:
parse(text='"\\x7F"')
expression("\177")
parse(text='"\\x80"')
Error: invalid multibyte string
and that error *is* correct behaviour -- you can't parse() something that isn't a valid character string.
Hmm... are you really sure? What is parsed is a (double) quoted \x80 (four characters + quotes), which has been put in a (single) quoted string where the backslash is escaped. Maybe it is clearer to write:
expr <- parse(text='x <- "\\x41"')
eval(expr)
print(x)
[1] "A"
and the same for
expr <- parse(text='x <- "\\x7F"')
eval(expr)
print(x)
expr <- parse(text='x <- "\\x80"')
eval(expr)
print(x)
(Unfortunately I can't access the machines that give me the errors right now, but I assume the error occurs when eval() is called.)
The error occurs when print():ing, i.e.
expr <- parse(text='x <- "\\x7F"')
eval(expr)
print(x)
[1] "\177"
expr <- parse(text='x <- "\\x80"')
eval(expr)
print(x)
[1]Error: invalid multibyte string
/Henrik
/H
-thomas
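The two levels of quoting discussed in this thread can be separated with a short sketch. It is an illustration under assumptions, not the posters' code: it assumes a recent R (validUTF8() was added in R 3.3.0, long after this exchange), and the names src, x_raw, ok_7f and ok_80 are made up for the example.

```r
# The text given to parse() is six ASCII characters: " \ x 8 0 "
src <- '"\\x80"'
src_len <- nchar(src, type = "bytes")

# Parsing and evaluating turns the escape into a single byte, 0x80.
expr <- parse(text = 'x <- "\\x80"')
eval(expr)
x_raw <- charToRaw(x)

# 0x7F is still 7-bit ASCII; a lone 0x80 is not a valid UTF-8 sequence.
ok_7f <- validUTF8("\x7F")
ok_80 <- validUTF8(x)
```

Checking x_raw and ok_80 instead of calling print(x) examines the string without asking R to render it, which is the step reported above as failing.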