input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?

On 22/04/2021 9:25 p.m., Spencer Graves wrote:
First, ANSI_X3.4-1968 is an official name for a version of ASCII.
It appears near the start of the file, where I believe it records the
native encoding in effect when the file was written, so that readers
using a different encoding can translate.

Your actual file appears to have been encoded in UTF-8, but not marked 
as such.  You're lucky you read it on macOS, where UTF-8 is the native 
encoding, since the reader probably recognized the bytes weren't ASCII
bytes (and warned you about that), then just left them alone.  If you 
read that file on Windows you'd likely get junk for those entries.
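
To illustrate the point about unmarked UTF-8 bytes, here is a minimal sketch in Python (used here only for illustration; it is not what R does internally). The sample string is hypothetical, since the actual strings in the file are not shown in this thread, and cp1252 is assumed as a typical Windows native encoding.

```python
# Hypothetical sample text ("cafe" with an acute accent on the e);
# the strings in the poster's file are not shown in the thread.
sample = "caf\u00e9"
raw = sample.encode("utf-8")  # the accented letter becomes two bytes: C3 A9


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Strict ASCII rejects the bytes, much like a reader that warns about them.
as_ascii = try_decode(raw, "ascii")

# cp1252 (assumed Windows native encoding) maps each byte to its own
# character, so the single accented letter turns into two junk characters.
as_cp1252 = try_decode(raw, "cp1252")

# Interpreting the same bytes as UTF-8 recovers the original text.
as_utf8 = try_decode(raw, "utf-8")

print(as_ascii, as_cp1252, as_utf8)
```

In R, the analogous repair would probably be to mark the affected strings with `Encoding(x) <- "UTF-8"` after loading, though whether that is enough depends on how the strings were created.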

For your interest, here's a dump of the start of your file, after 
gunzipping it:

00000000  52 44 58 33 0a 58 0a 00  00 00 03 00 03 06 00 00  |RDX3.X..........|
00000010  03 05 00 00 00 00 0e 41  4e 53 49 5f 58 33 2e 34  |.......ANSI_X3.4|
00000020  2d 31 39 36 38 00 00 04  02 00 00 00 01 00 04 00  |-1968...........|
00000030  09 00 00 00 01 78 00 00  03 13 00 00 00 10 00 00  |.....x..........|
00000040  02 0e 00 00 02 6e 40 90  0c 00 00 00 00 00 40 90  |.....n@.......@.|
00000050  44 00 00 00 00 00 40 10  00 00 00 00 00 00 40 7c  |D.....@.......@||
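
As a rough sketch of how the start of that dump can be read, the Python below decodes the first 37 bytes. It assumes the layout I understand version 3 of the serialization format to use: the "RDX3" magic line, an "X" (XDR) format line, three big-endian 32-bit integers (format version, writing R version, minimum reading R version), then a length-prefixed encoding name. The function and variable names are made up for this example.

```python
import struct

# First 37 bytes of the gunzipped file, copied from the dump above.
HEADER = bytes.fromhex(
    "524458330a"        # "RDX3\n" magic
    "580a"              # "X\n": XDR (big-endian binary) format
    "00000003"          # serialization format version
    "00030600"          # R version that wrote the file
    "00030500"          # minimum R version able to read it
    "0000000e"          # length of the encoding name (14)
    "414e53495f58332e342d31393638"  # the encoding name itself
)


def unpack_r_version(packed):
    """Split a packed version integer into (major, minor, patch)."""
    return (packed >> 16, (packed >> 8) & 0xFF, packed & 0xFF)


def parse_rdx3_header(data):
    """Decode the fixed header fields under the layout assumed above."""
    if data[:5] != b"RDX3\n":
        raise ValueError("missing RDX3 magic")
    if data[5:7] != b"X\n":
        raise ValueError("expected the XDR format marker")
    fmt_version, writer, min_reader, enc_len = struct.unpack(">4i", data[7:23])
    return {
        "format_version": fmt_version,
        "writer_version": unpack_r_version(writer),
        "min_reader_version": unpack_r_version(min_reader),
        "native_encoding": data[23:23 + enc_len].decode("ascii"),
    }


info = parse_rdx3_header(HEADER)
print(info)
```

Under those assumptions the header says the file was written by R 3.6.0 in format version 3, readable by R 3.5.0 or later, with ANSI_X3.4-1968 recorded as the native encoding.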

Duncan Murdoch