Native characterset is wrong for unicode builds for Windows

On 27/02/2015 2:31 AM, maillist at tlink.de wrote:
Because in Ubuntu, UTF-8 is the native encoding, and in your Windows
system, latin1 is the native encoding.

But I do agree that the format() issue is a problem.  I haven't traced
through the code, but I think the string "?????" is read using Windows
API functions that return a UTF-16 result, which R then converts to UTF-8.
So format() should see that it is a UTF-8 string and not convert it to
the native encoding.  There is nothing wrong with enc2native(); it's
doing what you asked for.  The problem is that format() is using it.
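
A minimal sketch of how the encoding marks involved can be inspected,
assuming a test string built from Unicode escapes (the string and the
variable names are illustrative only, not the ones from the original
report).  What enc2native() and format() return depends on the session's
locale, so no output is shown for those calls:

```r
# Illustrative string built from \u escapes; R marks it as UTF-8.
x <- "\u00e4\u00f6\u00fc"
Encoding(x)                  # "UTF-8"

# enc2native() converts to the session's native encoding, as documented.
# The resulting mark depends on the locale (e.g. latin1 on some Windows
# setups, UTF-8 on Ubuntu).
y <- enc2native(x)
Encoding(y)

# enc2utf8() leaves a string that is already marked UTF-8 unchanged.
z <- enc2utf8(x)
identical(x, z)

# Compare the mark on format()'s result with the mark on its input.
Encoding(format(x))

# Shows which native encoding the current session uses.
l10n_info()
```

If Encoding(format(x)) differs from Encoding(x) in a non-UTF-8 session,
that would be consistent with format() converting to the native encoding
along the way.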

Duncan Murdoch