use of Encoding()?

4 messages · Tilmann Faul, Olivier Crouzet, David Winsemius

#
Hey,

this is my first question here, so forgive me if I am clumsy.

I want to use Encoding() to set the encoding of a character vector, but
it does not seem to work. See example.
[1] "unknown"
[1] "unknown"

Is this intended?
Actually, I want to change the encoding of a character vector generated by
list.files() on a Linux computer with UTF-8 file encoding; the RStudio
encoding is ISO-8859-15.
Any hints?

best Tilmann
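
A minimal sketch of the distinction that matters for the file-name case, assuming a hypothetical file name `café.txt` standing in for what list.files() might return: `Encoding<-` only declares how existing bytes should be interpreted, whereas iconv() actually converts the bytes. Whether conversion to Latin-1 is what RStudio needs here is an assumption, not something the thread confirms.

```r
# Hypothetical file name; "\u00e9" makes R store it as UTF-8.
fn <- "caf\u00e9.txt"
Encoding(fn)                      # "UTF-8"

# iconv() rewrites the bytes; the result is marked as latin1.
fn_latin1 <- iconv(fn, from = "UTF-8", to = "latin1")
Encoding(fn_latin1)               # "latin1"

# The e-acute is now the single byte e9 instead of the UTF-8 pair c3 a9.
charToRaw(fn)
charToRaw(fn_latin1)
```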
#
Hi,

using R version 3.3.2 under Linux, these work perfectly (but I receive
a correct encoding ("UTF-8"), not "unknown"). 

What is your system (Windows, Mac, Linux)? Your R version? Which
interface (RStudio, Windows R interface)? Character encoding is often a
problem on Windows (in many different programming languages), but that
may not be the cause here.

If these operations are meant to read data from a file, you may
alternatively consider the option fileEncoding= of read.table /
read.csv (to re-encode the input while reading) or, though I would
suggest trying the preceding option first, encoding= (to declare the
file encoding explicitly if you know it but R does not detect it).
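
A small sketch of the two options, assuming a throwaway Latin-1 file written to a temporary path (the file content and the column name `word` are made up for illustration):

```r
# Write a tiny Latin-1 CSV byte by byte so the example does not depend
# on the locale of the R session (0xE7 is c-cedilla in Latin-1).
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(c(charToRaw("word\nfa"), as.raw(0xE7), charToRaw("ile\n")), con)
close(con)

# fileEncoding=: the input is converted from Latin-1 to the native
# encoding while it is read.
d1 <- read.csv(tmp, fileEncoding = "latin1", stringsAsFactors = FALSE)

# encoding=: the bytes are left as they are and the strings are only
# marked as Latin-1.
d2 <- read.csv(tmp, encoding = "latin1", stringsAsFactors = FALSE)
Encoding(d2$word)                 # "latin1"

unlink(tmp)
```

In a UTF-8 session both data frames should print the same word; they differ in how the string is stored internally.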

Olivier.


On Fri, 3 Feb 2017 17:29:20 +0100 Tilmann Faul
<Tilmann_Faul at t-online.de> wrote:

#
I wonder whether this is being done on a Mac, since I see the same behavior at my console (the "standard" R.app GUI). If the issue is with reading a Windows file using one of the `read.*` functions, then setting the `fileEncoding` parameter to 'iso-8859-1' or 'cp1252' may be worth trying.

The ?Encoding page says: "ASCII strings will never be marked with a declared encoding, since their representation is the same in all supported encodings."

Running the example in the help page (on a Mac):
[1] "unknown"
[1] "façile"
[1] "latin1"
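
The code that produced this output did not survive in the archive. A sketch along the lines of the ?Encoding help page example, not necessarily the exact calls that were run, would be:

```r
# "fa\xE7ile" contains the byte 0xE7, which is c-cedilla in Latin-1.
x <- "fa\xE7ile"
Encoding(x)                # locale dependent, "unknown" in a UTF-8 session

# Declare how the bytes are to be interpreted; no bytes are changed.
Encoding(x) <- "latin1"
Encoding(x)                # "latin1"

# iconv() converts the bytes and marks the result as UTF-8.
xx <- iconv(x, "latin1", "UTF-8")
Encoding(xx)               # "UTF-8"
```

The mark sticks here because the string contains a non-ASCII byte.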
#
On Fri, 3 Feb 2017 11:23:02 -0800
David Winsemius <dwinsemius at comcast.net> wrote:
Oops, I (erroneously) tried with accented characters, which explains my
earlier answer. I do (correctly) get "unknown" when using only characters
from the ASCII set, so my understanding is that there is no problem with
the OP's example: according to the help page quoted above, there is no
reason why "16-03-02" should be reported as anything other than "unknown",
since all of its characters are in the ASCII set.
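
This can be checked with a short sketch that uses the string from the original question (the variable name `x` is only for illustration):

```r
x <- "16-03-02"            # ASCII characters only

# Attempt to declare an encoding; for a pure ASCII string the mark is
# silently dropped.
Encoding(x) <- "UTF-8"
Encoding(x)                # "unknown"

# Every code point is below 128, i.e. within ASCII.
all(utf8ToInt(x) < 128)    # TRUE
```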

Olivier.