Message-ID: <52AABE4D.5010500@stats.ox.ac.uk>
Date: 2013-12-13T07:59:09Z
From: Brian Ripley
Subject: charToRaw("Œ") is not 8C in R console
In-Reply-To: <tencent_2760AA6767F679DD32E1FC82@qq.com>

On 13/12/2013 07:03, ???? wrote:
> in http://www.ascii-code.com/, you can see the hex value of Œ is 8C,

I don't see that: that is two characters and they are C5 and 92 in that 
table.  8C is an OE ligature there.

And what the 'hex value' is depends on the locale: see the preamble of 
that table (which seems to assume everyone uses CP1252): you have not 
stated yours.

> why in my R console ?
> charToRaw("Œ")
>   [1] c5 92
>   is not 8C ?

Because R is better at looking up hex values than you are.

I get

 > charToRaw("Å’")
[1] c3 85 e2 80 99

in UTF-8 (as will almost everyone not using Windows).
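
The dependence on encoding can be checked from any session without changing the locale. The following is a minimal sketch, not part of the original exchange: the variable names are illustrative, and it assumes the platform's iconv recognises the name "CP1252" (iconvlist() shows what is available).

```r
# U+0152 (the OE ligature), written as an escape so that the encoding of
# the source file or console does not matter.
oe <- enc2utf8("\u0152")

# The same character as bytes in two different encodings.
utf8_bytes   <- charToRaw(oe)                             # c5 92
cp1252_bytes <- charToRaw(iconv(oe, "UTF-8", "CP1252"))   # 8c

# Misreading the two UTF-8 bytes as CP1252 gives two characters
# (A-ring and a right single quote); re-encoding those as UTF-8
# yields five bytes.
misread       <- iconv(rawToChar(as.raw(c(0xc5, 0x92))), "CP1252", "UTF-8")
misread_bytes <- charToRaw(misread)                       # c3 85 e2 80 99

print(utf8_bytes)
print(cp1252_bytes)
print(misread_bytes)
```

The single byte 8c only appears when the string is encoded in CP1252. The five-byte result matches the output shown above for a UTF-8 session.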

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595