match function causing bad performance when using table function on factors with multibyte characters on Windows
Matthew Dowle wrote:
I'm not sure, but note the difference in locale between Linux (UTF-8) and Windows (non UTF-8). As far as I understand it R much prefers UTF-8, which Windows doesn't natively support. Otherwise you could just change your Windows locale to a UTF-8 locale to make R happier.
[...]
If anybody knows a way to trick R on Linux into thinking it has an encoding similar to Windows then I may be able to take a look if I can reproduce the problem in Linux.
After changing the locale to an ISO 8859-1 locale, i.e.:

  export LC_ALL="en_US.ISO-8859-1"
  export LANG="en_US.ISO-8859-1"

I could *not* reproduce it; that is, 'table' is as fast on the non-ASCII factor as it is on the ASCII factor.
Karl Ove Hufthammer
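
For anyone who wants to repeat the locale experiment above, here is a rough sketch rather than the exact commands that were run. The helper name with_locale is made up for illustration, and the two test vectors are stand-ins, since the original data is not shown in this thread. It assumes the en_US.ISO-8859-1 locale is installed. If it is not, R falls back to the "C" locale with a warning, so the script prints Sys.getlocale() and l10n_info() first to show which locale R actually ended up in.

```shell
#!/bin/sh
# Hypothetical helper: run one command with the locale overridden
# for that command only, leaving the calling shell untouched.
with_locale() {
    loc="$1"
    shift
    LC_ALL="$loc" LANG="$loc" "$@"
}

# Only attempt the timing run if R is on the PATH.
if command -v R >/dev/null 2>&1; then
    with_locale "en_US.ISO-8859-1" R --vanilla --quiet <<'EOF'
# Confirm which locale R is really using (a fallback shows up as "C").
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info())

# Stand-in data: one ASCII factor, one non-ASCII factor of the same size.
n <- 1e6
ascii    <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
nonascii <- factor(sample(c("\u00e6", "\u00f8", "\u00e5"), n, replace = TRUE))

# Compare how long table() takes on each.
print(system.time(table(ascii)))
print(system.time(table(nonascii)))
EOF
fi
```

Passing the locale per command avoids having to export and later reset LC_ALL and LANG in the interactive shell.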