Converting two byte encoding to UTF-8

I have solved it!

First, the bytes I have are offset by 0x80 from what they should
contain. The actual encoding of 亜 is 0x30 0x21. But subtracting 0x80
isn't enough; the two bytes are still treated as two characters:

 > iconv(as.raw(result[[1]]$kanji - 0x80), from = "JIS_X0208-1990", to = "UTF-8")
[1] "?" "?"

However, if I put those bytes in a list entry, it works:

 > iconv(list(as.raw(result[[1]]$kanji - 0x80)), from = "JIS_X0208-1990", to = "UTF-8")
[1] "亜"
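
For anyone who wants to try this without my data, here is a minimal sketch of the same idea. The sample bytes 0xB0 0xA1 are an assumption standing in for result[[1]]$kanji (they are 0x30 0x21 with 0x80 added to each byte), and raw_to_utf8() is a hypothetical helper, not part of R. The help page for iconv() describes x as a character vector, an object converted by as.character, or a list with raw elements, which is probably why the bare raw vector was handled one byte at a time. Bytes shifted up by 0x80 in this way also match the EUC-JP encoding, so the sketch uses that as a cross-check, since not every iconv build recognizes the name "JIS_X0208-1990".

```r
# Hypothetical sample standing in for result[[1]]$kanji:
# the JIS X 0208 code 0x30 0x21 with 0x80 added to each byte.
kanji_bytes <- c(0xB0, 0xA1)

# Remove the offset to get the plain JIS X 0208 bytes.
jis <- as.raw(kanji_bytes - 0x80)

# A bare raw vector becomes one string per byte when coerced,
# which is presumably why iconv() returned two results.
as.character(jis)   # "30" "21"

# Hypothetical helper: the list wrapper makes iconv() treat the
# raw vector as a single byte string.
raw_to_utf8 <- function(bytes, from) {
  iconv(list(bytes), from = from, to = "UTF-8")
}

# Approach from this post; not every iconv build knows this name,
# so an unsupported conversion is turned into NA here.
via_jis <- tryCatch(raw_to_utf8(jis, "JIS_X0208-1990"),
                    error = function(e) NA_character_)

# Cross-check: the unshifted bytes read as EUC-JP.
via_euc <- raw_to_utf8(as.raw(kanji_bytes), "EUC-JP")
```

Where both conversions are available, via_jis and via_euc should be the same single character (U+4E9C).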

Duncan Murdoch
On 19/03/2022 6:52 a.m., Duncan Murdoch wrote: