Message-ID: <58CAAD71-7EE8-4455-ADDA-81F10992DB82@me.com>
Date: 2011-08-15T19:13:10Z
From: Denis Chabot
Subject: accented vowels
In-Reply-To: <8BE248EA-761A-4951-BB33-4B1BF2B98302@globetrotter.net>

As a follow-up, I tried this:

a[2]
[1] "1_MO2 soles Sète sda.Rda"
b[2]
[1] "1_MO2 soles Sète sda.Rda"
a[2] == b[2]
[1] FALSE
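
One possible explanation, not confirmed on these files, is that the two strings spell the accented letter differently in Unicode: a single precomposed "è" (U+00E8) in the typed vector, versus "e" followed by a combining grave accent (U+0300) in the name returned by list.files. Both forms print the same way. The sketch below builds the two forms by hand (the names nfc and nfd are made up for illustration) and compares their bytes with charToRaw:

```r
# Two spellings of the same visible word, built by hand for illustration.
nfc <- "S\u00e8te"    # precomposed: U+00E8
nfd <- "Se\u0300te"   # decomposed: "e" + combining grave accent U+0300

# They can look the same on screen but are different strings.
identical(nfc, nfd)

# The raw UTF-8 bytes show where they differ.
charToRaw(nfc)
charToRaw(nfd)

# The same check on the vectors from this message would be:
# charToRaw(a[2]); charToRaw(b[2])
```

If a[2] and b[2] differ in this way, converting the file names with iconv(b, "UTF-8-MAC", "UTF-8") before calling gsub might be worth trying on OS X, assuming the system iconv recognises that encoding name.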

Denis
On 2011-08-15 at 14:42, Denis Chabot wrote:

> Hi,
> 
> I usually do not give a second thought to accented vowels, and R handles everything fine thanks to UTF-8 being used in my R scripts. But today I have a problem: accented vowels do not behave properly when they are imported into R using list.files.
> 
> Maybe this is because OS X (I'm using 10.6.8) still uses MacRoman for file names, though visually the names seem to have been read correctly into R.
> 
> An example is better than words:
> 
> sessionInfo()
> R version 2.13.1 (2011-07-08)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> 
> locale:
> [1] fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> 
> This does not cause a problem:
> a = c("1_MO2 crevettes po2crit.Rda", "1_MO2 soles Sète sda.Rda", "1_MO2 turbots po2crit.Rda"); a
> [1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"    "1_MO2 turbots po2crit.Rda"  
> 
> a2 = gsub(" Sète", "S", a); a2
> [1] "1_MO2 crevettes po2crit.Rda" "1_MO2 solesS sda.Rda"        "1_MO2 turbots po2crit.Rda"  
> 
> 
> But if, instead of creating the vector within the R script, I read it as a series of file names, the substitution does not work. I am sorry that I cannot make this a fully reproducible example, as it requires the 3 files to exist on your computer, but you could create 3 dummy files with the same names in a directory of your choice.
> 
> don = file.path("données/")
> b = list.files(path = don, pattern = "1_MO2"); b
> [1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"     "1_MO2 turbots po2crit.Rda"  
> 
> b2 = gsub(" Sète", "S", b); b2
> [1] "1_MO2 crevettes po2crit.Rda" "1_MO2 soles Sète sda.Rda"     "1_MO2 turbots po2crit.Rda"  
> 
> I am puzzled and also "stuck". For now I'll modify the file name, but I need to be able to handle such names at some point.
> 
> Any advice?
> 
> thanks in advance,
> 
> Denis
>