Plotting the ASCII character set.

Hello Rolf Turner,

On Sat, 3 Jul 2021 14:02:59 +1200
Rolf Turner <r.turner at auckland.ac.nz> wrote:

Part of the problem is that the "\xb0" byte is not in ASCII, which
covers only the lower half of possible 8-bit bytes. I guess that the
strings containing bytes with highest bit set used to be interpreted as
Latin-1 on your machine, but now get interpreted as UTF-8, which
changes their meaning (in UTF-8, a byte with the highest bit set is
always part of a multi-byte sequence, so a lone byte such as 0xb0 makes
the string invalid).
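
To make the difference concrete, here is a small sketch rather than a
transcript of your session. It assumes that rawToChar(as.raw(0xb0)) is a
fair stand-in for the "\xb0" literal from your example; I build the
string from a raw byte only so that the result does not depend on how
the parser treats the escape. The names a and utf8 are just
illustrative.

```r
# A one-byte string holding 0xB0 (stand-in for the "\xb0" literal)
a <- rawToChar(as.raw(0xb0))

charToRaw(a)      # b0: the single byte as stored
validUTF8(a)      # FALSE: a lone 0xB0 is not a complete UTF-8 sequence

# Read the same byte as Latin-1 and re-encode it as UTF-8
utf8 <- iconv(a, "latin1", "UTF-8")

charToRaw(utf8)   # c2 b0: the two-byte UTF-8 form of the degree sign
validUTF8(utf8)   # TRUE
```

So the byte itself has not changed between the two setups; only the
rule used to read it has.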

The good news is, since it's Latin-1, which is natively supported by R,
there are even multiple options:

1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
let R do the re-encoding if and when Pango asks it for a UTF-8-encoded
string.

2. Convert from Latin-1 to the locale encoding by using iconv(a,
'latin1', '') (or set the third parameter to 'UTF-8', which would give
almost the same result on a machine with a UTF-8 locale). The result is,
again, a
string where Encoding(a) matches the truth. Explicitly setting UTF-8
may be preferable on Windows machines running pre-UCRT builds of R
where the locale encoding may not contain all Latin-1 characters, but
that's not a problem for you, as far as I know.
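
The two options can be sketched side by side as follows. Again, this is
only an illustration under the assumption that a one-byte string built
with rawToChar(as.raw(0xb0)) behaves like your "\xb0"; the names a1, a2
and a3 are mine and do not come from your code.

```r
a <- rawToChar(as.raw(0xb0))

# Option 1: only declare the encoding; the stored byte is untouched
a1 <- a
Encoding(a1) <- "latin1"
Encoding(a1)              # "latin1"
charToRaw(a1)             # b0
charToRaw(enc2utf8(a1))   # c2 b0: what R produces when UTF-8 is requested

# Option 2: convert the bytes explicitly to UTF-8
a2 <- iconv(a, "latin1", "UTF-8")
Encoding(a2)              # "UTF-8"
charToRaw(a2)             # c2 b0

# Both now denote the same character, although the stored bytes differ
a1 == a2                  # TRUE

# Converting to the locale encoding instead; the result depends on the
# session locale, so no output is shown here
a3 <- iconv(a, "latin1", "")
```

The practical difference is that option 1 keeps the original byte and
defers the conversion to R, while option 2 stores the converted bytes
right away.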

For any encoding other than Latin-1 or UTF-8, option (2) is still valid.

I have verified that your example works on my GNU/Linux system with a
UTF-8 locale if I use either option.