Message-ID: <A9D3E344-3B44-42C8-97EC-978AE5D2246D@r-project.org>
Date: 2007-12-27T20:39:42Z
From: Simon Urbanek
Subject: encoding question again
In-Reply-To: <004201c848c4$b2fbb790$15b2a8c0@lifebook2>

Matthias,

you get exactly what you specified - namely UTF-8. If you want your  
html file to be latin1, then you have to say so:

zz = file( paste("Itemtabelle/Itemtabelle", abt, ".html"), "wt", encoding = "latin1")
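
A quick way to check what the connection actually writes is to read the bytes back. This is only a sketch: `tmp` is a throwaway path and `bytes` is an illustrative name, neither is taken from the function in question, and the exact result depends on the locale the session runs in.

```r
# Sketch only: tmp is a temporary file, not the file from the original function.
tmp <- tempfile(fileext = ".html")
zz <- file(tmp, "wt", encoding = "latin1")   # the connection re-encodes text on output
cat("n\u00fcr\n", file = zz)                 # "\u00fc" is u-umlaut, escaped to keep the source ASCII
close(zz)

# Read the file back as raw bytes to see what was written.
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
print(bytes)   # in a UTF-8 or Latin-1 locale the umlaut should show up as the single byte fc
unlink(tmp)    # unlink() takes the file name, not the connection
```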

In addition, you're assuming that `abt' is in an encoding that your OS
understands. If it's not, you had better convert it to one that is.
From your results it seems that `abt' is also UTF-8 encoded. Since
you didn't tell us where it came from, you should either fix the
source or use something like iconv(abt,"utf-8","latin1"):

(in UTF-8 locale)
> abt="nür"
> cat(abt,"\n")
nür
> charToRaw(abt)
[1] 6e c3 bc 72
> charToRaw(iconv(abt,"utf-8","latin1"))
[1] 6e fc 72
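
Applied to the label from the question, the same conversion could look roughly like the sketch below. The helper name `to_latin1` is made up for illustration, the input is assumed to really be UTF-8, and `sep = ""` is an addition here (by default paste() puts a space between its arguments, which is where the extra blanks in the generated filename come from).

```r
# Hypothetical helper (not from the original post): convert a UTF-8 label to
# latin1 once, then reuse it for both the file name and the file contents.
to_latin1 <- function(x) {
  y <- iconv(x, from = "UTF-8", to = "latin1")
  # iconv() returns NA for strings it cannot convert
  if (any(is.na(y))) stop("label cannot be represented in latin1")
  y
}

abt <- "\u00c4rzte Innere, Gyn\u00e4kologie"   # UTF-8 input; escapes keep the source ASCII
abt_l1 <- to_latin1(abt)

charToRaw(abt)      # UTF-8: each umlaut takes two bytes
charToRaw(abt_l1)   # latin1: each umlaut takes one byte

# sep = "" avoids the blanks that paste() inserts by default
fname <- paste("Itemtabelle/Itemtabelle", abt_l1, ".html", sep = "")
```

Whether the OS accepts `fname` as a file name still depends on the locale the session runs in, so treat this as a starting point rather than a fix.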

Cheers,
Simon


On Dec 27, 2007, at 3:11 PM, Matthias Wendel wrote:

> Hi, R Devils,
> I'm running the current R version in JGR (version 1.5-8).
> Sys.getlocale(category = "LC_ALL") yields
> [1] "LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252"
>
> I want to write some HTML code, enriched with statistical results and
> labels encoded in Latin-1, which I pass to a function. One of the
> labels is used to build the filename. Although the labels are handled
> correctly in JGR, they are somehow converted when they are written to
> the file. The filename is not constructed as wanted either. The
> function definition is sourced into R correctly. The function is
> defined like this:
>
> Itemtabelle.head <- function (abt ){
>   # n?r z?m T?ST
>   zz = file( paste("Itemtabelle/Itemtabelle", abt, ".html"), "wt", encoding = "UTF-8")
>   cat(as.character("<html xmlns:o=\"urn:schemas-microsoft-com:office:office\" xmlns:x=\"urn:schemas-microsoft-com:office:excel\" xmlns=\"http://www.w3.org/TR/REC-html40\">  \n"),
>       as.character("    <head>    \n"),
>       .
>       .
>       .
>       as.character("        <td colspan=5 class=xl28 width=727 style=\'width:545pt\'>Gesundheitsindikatoren:  "), abt, as.character("</td>    \n"),
>       as.character("       </tr>    "), file = zz)
>       close(zz)
>       unlink(zz)
> }
> Setting abt as " Ärzte Innere, Gynäkologie" and calling the function
> with this argument, yields a filename "Itemtabelle  ??rzte Innere,
> Gyn??kologie .html" and in the file a line
>         <td colspan=5 class=xl28 width=727 style='width:545pt'>Gesundheitsindikatoren:    ????rzte Innere, Gyn????kologie </td>
> is generated.
> I tried to solve this by using iconv, without success.
> The problem remains the same in the rgui and rterm - in rterm the
> resulting filename is "Itemtabelle ?rzte Innere, Gyn?kologie  .html".
>
> Cheers,
> Matthias
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel