
[R-pkg-devel] Warning... unable to translate 'Ekstr<f8>m' to a wide string; Error... input string 1 is invalid

On 19/07/2022 2:23 p.m., Spencer Graves wrote:
That's all correct.
I think that advice is too specific to this example.  Here's a short 
explanation about what's going on:

Strings in R are assumed to be in the ASCII encoding if they have no 
bytes of 128 (hex 80) or above.

If they do have such bytes, they can be marked as "latin1", or "UTF-8", 
or not marked, in which case they are assumed to be in the local encoding.

If you write a string containing "\u....", it is marked as being in the 
"UTF-8" encoding.

If you write a string containing "\x....", it is not marked.
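
To see these rules in action, here is a small sketch using Encoding() and 
charToRaw().  The variable names are mine, and I'm assuming a reasonably 
recent R; note that pure ASCII strings report "unknown" too, since they 
never need a mark:

```r
# Three ways of writing a similar-looking string literal.
ascii_only <- "facile"        # no bytes of 0x80 or above
via_u      <- "fa\u00E7ile"   # Unicode escape
via_x      <- "fa\xE7ile"     # raw byte escape

Encoding(ascii_only)  # "unknown" (ASCII strings are never marked)
Encoding(via_u)       # "UTF-8"
Encoding(via_x)       # "unknown" (will be read in the native encoding)

# The stored bytes differ: \u stores the two-byte UTF-8 sequence c3 a7,
# while \x stores the single byte e7.
charToRaw(via_u)      # 66 61 c3 a7 69 6c 65
charToRaw(via_x)      # 66 61 e7 69 6c 65
```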

Thus if you are writing strings for others to use, you don't know how 
those strings will be interpreted unless you explicitly set their 
encoding.  For example, this is ambiguous:

    x <- "fa\xE7ile"

This is not:

    x <- "fa\xE7ile"
    Encoding(x) <- "latin1"
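
Once the encoding is declared, converting to UTF-8 is well defined.  A 
minimal sketch (y and z are just illustrative names; enc2utf8() and 
iconv() are both in base R):

```r
x <- "fa\xE7ile"
Encoding(x) <- "latin1"

# enc2utf8() uses the declared encoding to do the conversion.
y <- enc2utf8(x)
Encoding(y)                   # "UTF-8"
identical(y, "fa\u00E7ile")   # TRUE
charToRaw(y)                  # 66 61 c3 a7 69 6c 65

# iconv() is an alternative that names the source encoding explicitly,
# so it does not depend on the mark at all.
z <- iconv("fa\xE7ile", from = "latin1", to = "UTF-8")
identical(z, y)               # TRUE
```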

The advice you received to change the \x to \u works for your examples, 
but might fail in other examples.  As help("Quotes") says, "\uE7" is the 
Unicode code point hex E7, which is a c with a cedilla.  (The two hex 
digit Unicode values from 80 to FF match the Latin-1 values; but not 
everyone lives and works in a Latin-1 locale, so \xE7 might not be 
equivalent to \uE7 for some people.)

You can have 1-4 hex digits after \u.  If the next character happens to 
be a hex digit, you'll get some other character, e.g. "\uE7" is a ç (a c 
with a cedilla), but "\uE7a" is a single character from the Thai block 
(U+0E7A), and "\uE7ab" is some other single character (U+E7AB, in the 
"private use area" of Unicode).

So it's safest to use exactly 4 hex digits as \u00E7, or to wrap the 
value in curly braces, \u{E7}.
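
You can check how many hex digits the parser consumed with utf8ToInt() 
and nchar().  This is only a sketch; the variable names are mine:

```r
short  <- "\uE7"      # just the c with a cedilla
greedy <- "\uE7a"     # the "a" is swallowed as a third hex digit
longer <- "\uE7ab"    # "ab" are swallowed as third and fourth hex digits

utf8ToInt(short)      # 231   (U+00E7)
utf8ToInt(greedy)     # 3706  (U+0E7A)
utf8ToInt(longer)     # 59307 (U+E7AB)
nchar(greedy)         # 1

# Both safe forms keep the following letters as separate characters.
padded <- "\u00E7ab"
braced <- "\u{E7}ab"
nchar(padded)               # 3
identical(padded, braced)   # TRUE
```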

Some Unicode characters need more than 4 hex digits.  Use \U for those.
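
For example, U+1F600 is outside the 4-digit range.  A small sketch, 
assuming a platform and R version that handle code points beyond the 
Basic Multilingual Plane (older Windows builds of R had limitations 
here); the variable name is mine:

```r
# \U takes up to 8 hex digits, so it can reach code points above U+FFFF.
face <- "\U0001F600"

Encoding(face)                # "UTF-8"
utf8ToInt(face)               # 128512 (0x1F600)
nchar(face, type = "bytes")   # 4, since it needs four bytes in UTF-8
```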

Duncan Murdoch
