Message-ID: <20220124112342.6673a75d@parabola>
Date: 2022-01-24T08:23:42Z
From: Ivan Krylov
Subject: [R-pkg-devel] ASCII code for Degree symbol °
In-Reply-To: <007601d810c7$6050bd30$20f23790$@gmail.com>
On Sun, 23 Jan 2022 21:09:09 -0500
<dbosak01 at gmail.com> wrote:
> vec1 <- gsub("[\xB0]", ".", vec)
A great degree of care is needed with this.
Encoding('\xB0') is "unknown", i.e. \xXX escape codes are assumed to be
bytes in your native system encoding. On GNU/Linux and other systems
where native encoding is UTF-8 (and not Latin-1 or an ANSI code page),
'\xB0' is an invalid byte sequence, not a degree symbol:
'\xB0' == '°'
# [1] FALSE
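As a quick check that does not depend on the session's locale, something like the following sketch can be used. It assumes R >= 3.3.0 for validUTF8(), and the variable names are only illustrative. The iconv() call assumes the byte was meant as Latin-1, which may not be true of your data:

```r
# A lone 0xB0 byte, as produced by the '\xB0' escape
bad <- "\xB0"

# 0xB0 on its own is a UTF-8 continuation byte, so this is FALSE
is_valid <- validUTF8(bad)
print(is_valid)

# Assumption: the byte was intended as Latin-1. Converting it explicitly
# yields a string that is marked as UTF-8.
converted <- iconv(bad, from = "latin1", to = "UTF-8")
print(Encoding(converted))
print(as.hexmode(utf8ToInt(converted)))
```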
On the other hand, the code point for ° is also U+00B0:
as.hexmode(utf8ToInt('°'))
# [1] "b0"
'\ub0'
# [1] "°"
The difference is that Encoding('\ub0') is "UTF-8" and is therefore
portable between systems with different native encodings.
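
A minimal sketch of the replacement from the quoted message, rewritten with the \u escape. The contents of vec are hypothetical, since the original vector is not shown in this thread, and fixed = TRUE is my own choice here because the pattern is a single literal character rather than a regular expression:

```r
# Hypothetical input; the real 'vec' is not shown in the thread
vec <- c("25\u00b0C", "no symbol here")

# '\u00b0' is marked as UTF-8, so the pattern means the same thing
# whatever the native encoding of the session is
vec1 <- gsub("\u00b0", ".", vec, fixed = TRUE)
print(vec1)
```

This only matches strings that R already knows to be UTF-8 (or can translate from the native encoding). Input read from a file as raw Latin-1 bytes would still need its encoding declared or converted first.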
--
Best regards,
Ivan