Message-ID: <CABtg=K=_w9rqqua5V-dqB7w7wiQpXaMub7MknNpzbU+rP_8fPQ@mail.gmail.com>
Date: 2022-02-21T10:33:30Z
From: Gábor Csárdi
Subject: deparse() and UTF-8 strings
I am wondering if it would make sense to produce \u escaped strings in
deparse() for UTF-8 input. Currently we have (in R-devel):
x <- "G\u00e1bor"
Sys.setlocale("LC_ALL", "C")
#> [1] "C/C/C/C/C/en_US.UTF-8"
deparse(x)
#> [1] "\"G<U+00E1>bor\""
charToRaw(deparse(x))
#> [1] 22 47 3c 55 2b 30 30 45 31 3e 62 6f 72 22
Is there a reason why this is preferable to returning
"\"G\\u00e1bor\""
i.e.
charToRaw("\"G\\u00e1bor\"")
#> [1] 22 47 5c 75 30 30 65 31 62 6f 72 22
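For illustration, the escaped form could be built along these lines. This is only a rough sketch: escape_non_ascii() is a hypothetical helper, not something in base R, and it ignores everything else deparse() has to handle (quotes, backslashes, control characters).

```r
# Hypothetical helper (not part of base R): replace every non-ASCII
# character in a single string with a \uXXXX or \UXXXXXXXX escape.
escape_non_ascii <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  cps <- utf8ToInt(enc2utf8(x))          # Unicode code points of the input
  pieces <- vapply(cps, function(cp) {
    if (cp < 128L) {
      intToUtf8(cp)                      # ASCII is kept as-is
    } else if (cp <= 0xFFFF) {
      sprintf("\\u%04x", cp)             # code points in the BMP
    } else {
      sprintf("\\U%08x", cp)             # code points beyond the BMP
    }
  }, character(1))
  paste(pieces, collapse = "")
}

# Apply it to the quoted string by hand, since deparse() output
# depends on the locale.
escaped <- escape_non_ascii("\"G\u00e1bor\"")
charToRaw(escaped)
```

The result should contain only ASCII bytes, so it prints the same way in any locale.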
Returning the \u escaped form would make deparse() the inverse of
parse(), at least in this respect.
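
A minimal sketch of the round trip I have in mind. The two deparsed strings are written out by hand here rather than produced by deparse(), and round_trips() is just a throwaway name for this example:

```r
x <- "G\u00e1bor"
current  <- "\"G<U+00E1>bor\""   # what deparse(x) returns in the C locale above
proposed <- "\"G\\u00e1bor\""    # the suggested \u escaped form

# Parse the deparsed code, evaluate it, and compare with the original value.
round_trips <- function(code, value) {
  identical(eval(parse(text = code)[[1]]), value)
}

round_trips(proposed, x)
round_trips(current, x)
```

The first call should return TRUE and the second FALSE, because the <U+00E1> form parses back as those eight literal characters rather than as the single character U+00E1.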
Thank you,
Gabor