Problem with UTF-8 text in the Rcmdr package
The issue appears to be with the Rcmdr output window and menus. They are drawn by Tcl/Tk, not by R. So this might be a problem in Tcl/Tk or the fonts it uses, or it might be a problem with what Rcmdr passes to the tcltk package.

We need the means to reproduce this (as per the posting guide):

- What OSes are affected? Does this occur in a UTF-8 locale on Linux, for example?
- In what locales?
- What versions of Tcl/Tk? Note that the version shipped with Windows R changed between 2.5.1 and 2.7.x.
- Is this anything to do with translations? I've not looked at how translations are done in Rcmdr, but if gettext() is used, the string passed to R for output is in the native encoding, so 'UTF-8 characters' is incorrect. It is possible that it is an iconv problem if the translations are supplied in UTF-8 and not Latin-2.

There are far too many layers involved here to guess at what is going on. My guess is that it ought to be possible to give a simple example of a string which can be output to the Rcmdr console and will be rendered incorrectly (together with a screen shot of how it is rendered).

I think the characters referred to are the Unicode glyphs 's and z with caron', \u0161 and \u017E. It seems that these will only be displayable in Rcmdr on Windows in a Latin-2 locale, which I do not have set up on Windows (but believe I could get installed). However, examples using those characters (and the menus) seem to be correct in both sl_SI.iso88592 and sl_SI.utf8 on Linux, which suggests that this is probably not an R issue but a Tcl/Tk one.
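To make the encoding question concrete, here is a minimal sketch (in Python, used only to inspect bytes; it does not involve R, Rcmdr or Tcl/Tk) of how the three Slovenian letters are encoded in UTF-8, ISO-8859-2 (Latin-2), cp1250 and Latin-1. The names CHARS, ENCODINGS, encode_or_none and byte_table are made up for this illustration, and it assumes that the Windows Central European code page is cp1250. It shows that c with caron has the same byte in ISO-8859-2 and cp1250 while s and z with caron do not. That is at least consistent with the report below that only s and z are affected, but whether it is the actual cause is a guess.

```python
# Illustrative sketch: byte values of the three Slovenian letters in a few
# encodings. All names here are hypothetical, chosen for this example only.
CHARS = {
    "c with caron": "\u010d",
    "s with caron": "\u0161",
    "z with caron": "\u017e",
}
ENCODINGS = ["utf-8", "iso8859_2", "cp1250", "latin_1"]


def encode_or_none(ch, enc):
    """Return the encoded bytes, or None if the charset lacks the character."""
    try:
        return ch.encode(enc)
    except UnicodeEncodeError:
        return None


def byte_table():
    """Map each character name to {encoding: bytes or None}."""
    return {
        name: {enc: encode_or_none(ch, enc) for enc in ENCODINGS}
        for name, ch in CHARS.items()
    }


if __name__ == "__main__":
    for name, row in byte_table().items():
        cells = []
        for enc in ENCODINGS:
            raw = row[enc]
            cells.append(f"{enc}={raw.hex() if raw is not None else 'n/a'}")
        print(f"{name}: " + "  ".join(cells))
```

Comparing the iso8859_2 and cp1250 columns for each letter is the point of the exercise: a string encoded in one and decoded as the other would leave c with caron intact and change the other two.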
On Fri, 5 Sep 2008, John Fox wrote:
Dear list members, I've attached some email correspondence with Jaro Lajovic (with his permission), detailing a problem with the Slovenian translation file for the Rcmdr package.
Unfortunately, it is not 'detailed', and we do need the details.
In brief, certain UTF-8 characters used in Slovenian appeared properly in older versions of R, but some of them do not display properly in the Rcmdr menus and output window under R 2.7.x. I've confirmed the problem with the current version of the Rcmdr package (1.4-0) and R 2.7.2 under Windows Vista. I've checked the R docs and NEWS file for changes to R, but wasn't able to turn up anything that seemed relevant. Frankly, however, my understanding of how various character sets are handled is only partial. Any help would be appreciated.

John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

-----Original Message-----
From: Jaro.Lajovic [mailto:Jaro.Lajovic at mf.uni-lj.si]
Sent: August-26-08 2:57 AM
To: John Fox
Subject: Re: Slovenian Rcmdr .po and .mo - and a problem

Dear John,
That seems to imply that there's a change in R rather than in the Rcmdr that produced this problem. Do you notice the problem with any other packages that use translation or with R itself?
As for other translated R packages, I am afraid I am not aware of any. However, a quick test using cat with special characters: cat("??????\n") reveals that the string prints OK in the R (2.7.1) console. The command line also shows OK in the Rcmdr Script window, but does not display right in the Output window. Special chars also fail in the Messages window. Input (Script window) thus seems not to be affected, while the menu system and output do not work properly.

Thank you very much,
Jaro
On Mon, 25 Aug 2008 21:54:43 +0200 "Jaro.Lajovic" <Jaro.Lajovic at mf.uni-lj.si> wrote:
Dear John,
One question though: I assume from your message that the previous version of the Rcmdr worked OK with R 2.7.1. Is that right?
No, the version 1.3-5 (that I still have with R 2.5.1) does not work with R 2.7.1 either. So:

Rcmdr 1.3-5 with R 2.5.1: works OK.
Rcmdr 1.3-5 with R 2.7.1: does not work properly.
Rcmdr 1.4-0 with R 2.7.1: does not work properly.

Thank you in advance,
Jaro
On Mon, 25 Aug 2008 18:52:32 +0200 "Jaro.Lajovic" <Jaro.Lajovic at mf.uni-lj.si> wrote:
Dear John,

Please find attached zipped Slovenian versions of .po (plain text and UTF-8 coded text) and .mo files.

However, there seems to be a problem I have not been able to resolve. While special characters display properly under R version 2.5.1 with Rcmdr 1.3-5, they fail to display (= are substituted by black blocks) under R version 2.7.1 with the new Rcmdr 1.4-0. By the way: the .mo file of the ver. 1.3-5 copied to 1.4-0 also failed to display properly.

(An additional detail: three special characters that are used in the Slo version are c, s and z with hacek. c with hacek is not affected, it is just s and z with hacek that are not displayed OK.)

Your advice will be much appreciated.

With best regards,
Jaro
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595