Native characterset is wrong for unicode builds for Windows

On 26/02/2015 6:34 PM, maillist at tlink.de wrote:
R uses those functions, so I guess it is a "unicode application".  But
internally it uses an 8 bit encoding (normally the native one for the
platform it is running on, which in your case is apparently latin1).
Windows 95 had UCS-2 support, which was pretty close to UTF-16.

But this line of operating systems has been dead for 10 years.
So "unicode application" is something you just made up.

If you use Windows development tools, they have macros to convert
generic functions to either A or W versions.  R doesn't use those.  It
calls the W functions when it has UTF-16 characters, and A functions
when it has native characters.  I would love it if R was a UTF-8
application, because it would make life so much simpler, but Windows
doesn't support that.  So R needs to do tons of conversions.  If you
don't like that, you probably need to stick with Ubuntu.
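
To give a rough idea of the kind of conversion described above, here is a minimal sketch in Python (not R's actual C code). The function names are hypothetical, and latin1 is simply assumed to be the native encoding, as in the case discussed; a real system would query the active code page instead.

```python
# Illustrative sketch only: these helpers are hypothetical, not R internals.
# Assumption: the native 8-bit encoding is latin1.
NATIVE = "latin-1"


def native_to_utf16(raw: bytes) -> bytes:
    """Convert native 8-bit bytes to UTF-16LE, as a W function would expect."""
    return raw.decode(NATIVE).encode("utf-16-le")


def utf16_to_native(wide: bytes) -> bytes:
    """Convert UTF-16LE bytes back to the native encoding.

    Characters the native encoding cannot represent are replaced by '?',
    which is where information gets lost.
    """
    return wide.decode("utf-16-le").encode(NATIVE, errors="replace")


def pick_api(is_wide: bool) -> str:
    """Choose the W variant for UTF-16 text and the A variant for native text."""
    return "W" if is_wide else "A"
```

The round trip is lossless only for characters that exist in the native encoding; anything else (the euro sign in latin1, for instance) degrades to a placeholder on the way back.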

Duncan Murdoch