Plotting the ASCII character set.
On Sun, 4 Jul 2021 13:59:49 +1200
Rolf Turner <r.turner at auckland.ac.nz> wrote:
a substantial number of the characters are displayed as a wee rectangle containing a 2 x 2 array of digits such as
    0 0
    8 0
Interesting. I didn't pay attention to it at first, but now I see that the range of code points U+0080 to U+009F corresponds to control characters, not anything printable (also, U+00A0 is the no-break space). Latin-1 itself doesn't define any meaning for bytes 0x80..0x9f, but here they are decoded to the same-valued Unicode code points. And the actual code point for € is U+20AC, not even close to what we're working with.
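To make the mismatch concrete, here is a small sketch comparing the code point of the Euro sign with what byte 0x80 becomes when it is decoded as ISO-8859-1. It assumes the system iconv() knows the name "ISO-8859-1"; the variable names are only for illustration.

```r
# Code point of the Euro sign, taken from a UTF-8 string literal.
euro_cp <- utf8ToInt("\u20ac")
euro_label <- sprintf("U+%04X", euro_cp)

# Byte 0x80 decoded as ISO-8859-1 maps to the same-valued code point,
# which is a control character rather than the Euro sign.
b80 <- rawToChar(as.raw(0x80))
latin1_cp <- utf8ToInt(iconv(b80, from = "ISO-8859-1", to = "UTF-8"))
latin1_label <- sprintf("U+%04X", latin1_cp)

print(c(euro = euro_label, byte80_as_latin1 = latin1_label))
```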
Also note that there is a bit of difference between the results of using Encoding() and the results of using iconv()
You are right. I didn't know that, but my reading of the function translateToNative in src/main/sysutils.c suggests that R decodes strings marked as 'latin1' as Windows-1252 (if the system iconv() provides it) and uses the actual Latin-1 as a fallback. ?Encoding does warn that 'latin1' is ambiguous and system-dependent with regard to bytes 0x80..0x9f, so text() seems to be right to use Latin-1 and not Windows-1252 when it plots byte 0x80 marked as CE_LATIN1 as U+0080. There is, however, a /* FIXME: allow CP1252? */ comment in src/main/sysutils.c, in the function reEnc, which is used by text().
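The difference between the two approaches can be seen on a single byte. This is only a sketch, assuming the system iconv() supports "CP1252": Encoding<- merely marks the string and leaves the bytes alone, whereas iconv() actually re-encodes them.

```r
x <- rawToChar(as.raw(0x80))

# Encoding<- only sets the declared encoding; the stored byte is unchanged.
y <- x
Encoding(y) <- "latin1"
marked_bytes <- charToRaw(y)

# iconv() re-encodes: byte 0x80 read as Windows-1252 becomes the
# three-byte UTF-8 sequence for U+20AC.
z <- iconv(x, from = "CP1252", to = "UTF-8")
converted_bytes <- charToRaw(z)

print(marked_bytes)
print(converted_bytes)
```

What a later conversion of y yields (for example via enc2utf8(y)) is the system-dependent part described above, so no result is shown for it here.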
Is there any way that I can get the Euro symbol to display correctly in such a graphic?
I think that iconv(a, 'CP1252', '', '\ufffd') should work for you. At least it seems to work for the ? sign. It does leave the following bytes undefined, represented as ? U+FFFD REPLACEMENT CHARACTER: as.raw(which(is.na( iconv(sapply(as.raw(1:255), rawToChar), 'CP1252', '') ))) # [1] 81 8d 8f 90 9d Not sure what can be done about those. With Latin-1, they would correspond to unprintable control characters anyway.
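Put together, the suggestion might look like the sketch below. Two things here are my assumptions and not part of the original question: the target encoding is "UTF-8" instead of '' (the native encoding), so the result does not depend on the locale, and plot_chars() is a hypothetical helper that lays the characters out on a 16-column grid.

```r
# One single-byte string per byte value 1..255 (0 cannot be stored in a string).
bytes <- sapply(as.raw(1:255), rawToChar)

# Decode as Windows-1252; bytes with no CP1252 meaning are replaced by 'sub'.
chars <- iconv(bytes, from = "CP1252", to = "UTF-8", sub = "\ufffd")

# Hypothetical helper: draw the characters on a grid with 16 columns.
plot_chars <- function(chars) {
  i <- seq_along(chars) - 1
  plot(i %% 16, -(i %/% 16), type = "n", axes = FALSE, xlab = "", ylab = "")
  text(i %% 16, -(i %/% 16), labels = chars)
}
```

Calling plot_chars(chars) on a device that can render UTF-8 text should then show the Euro sign at the position of byte 0x80. Whether the glyph actually appears still depends on the device and the font.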
Best regards, Ivan