Dear developeRs,

Compilation of the latest version (0.9-5) of my actuar package fails with r-release MacOS_X ix86 on CRAN; see

http://www.R-project.org/nosvn/R.check/r-release-macosx-ix86/actuar-00check.html

All errors come from accented letters in comments in latin-1 encoded files (except hierarc.R which is in UTF-8, my bad). Encoding is declared as latin-1 in DESCRIPTION.

The package checks and compiles fine on Windows, Linux and, ironically, my MacOS X main development machine. I realize using non-ASCII characters in source files is not a good idea and I removed them, but I would appreciate any clue as to what went wrong with the compilation on CRAN.

FWIW,

> sessionInfo()
R version 2.6.2 (2008-02-08)
i386-apple-darwin8.10.1

locale:
fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats utils datasets grDevices graphics methods base

other attached packages:
[1] CarbonEL_0.1-4

loaded via a namespace (and not attached):
[1] tools_2.6.2

Thanks in advance!

---
Vincent Goulet, Associate Professor
École d'actuariat
Université Laval, Québec
Vincent.Goulet at act.ulaval.ca http://vgoulet.act.ulaval.ca
Small encoding question
6 messages · Vincent Goulet, Kurt Hornik, Brian Ripley +1 more
Kurt Hornik wrote:
I assume that the MacOS X builds are done in a C locale?

Best
-k
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
On Feb 14, 2008, at 2:45 PM, Kurt Hornik wrote:
I assume that the MacOS X builds are done in a C locale?
Yes - but isn't this very similar to the problem we have been talking about a while back? The check analyses were reporting an error although the code was fine (I think it boiled down to text connection I/O in the check scripts failing mysteriously due to the fact that it was using the wrong encoding). I'll have to check later today ...

Cheers,
S
On Feb 14, 2008, at 2:45 PM, Kurt Hornik wrote:
I assume that the MacOS X builds are done in a C locale?
That was my first thought, but it worked in a C locale for me, even on Mac OS X. But then we know there are C locales and C locales .... I think R-devel is somewhat less prone to such issues, and it was R-devel I checked.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
I think I found the cause, but fixing it may be more complicated (other than a hot fix for this particular case). What it boils down to is that the code for .check_package_code_syntax is trying to change the locale in a manner that doesn't work. In addition to that, the output of l10n_info() is wrong (for some definition of wrong), which complicates things even further. To top it all, if run in a UTF-8 locale, everything is just fine - that's why the package will pass check on "regular" OS X, because UTF-8 locale is the default since Leopard.

.check_package_code_syntax() sees that the source requires Latin1, so it is checking whether the locale is utf-8, but it's not (because we force C) so it uses en_US. This may be the first problem, because en_US is not necessarily a latin1 locale at all (en_US.ISO8859-1 would be latin1 on OS X). However, the next problem is that l10n_info() is returning FALSE even for the (correct) latin1 locale and consequently(?) the reading fails.

ginaz:~$ echo 'Sys.getlocale(); l10n_info()'|LANG=en_US.ISO8859-1 R --vanilla --slave
[1] "en_US.ISO8859-1/en_US.ISO8859-1/en_US.ISO8859-1/C/en_US.ISO8859-1/en_US.ISO8859-1"
$MBCS
[1] FALSE

$`UTF-8`
[1] FALSE

$`Latin-1`
[1] FALSE

en_US.ISO8859-1 *is* a latin-1 locale ...

I was looking hard and found no way how to link (installed) locales to encodings - there is no official mapping and POSIX allows arbitrary locales (and names) .. Hence all locale names are merely loose conventions... so I'm not sure how can R even make such a decision (other than parse the name?).

Anyway - a quick fix would be to force en_US.UTF-8 locale in that check for Mac OS X, but I think that doesn't fix the underlying problems ...

Cheers,
Simon
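To make the "parse the name?" idea above concrete, here is a minimal sketch of what pulling the codeset out of a locale name could look like. The helper name locale_codeset is hypothetical, and it assumes the conventional language_TERRITORY.codeset@modifier layout, which (as noted above) is only a convention. This is not how R itself decides.

```r
# Hypothetical helper (not part of R): extract the codeset part of a
# single locale name of the form language_TERRITORY.codeset@modifier.
locale_codeset <- function(locale) {
  # Capture everything between the first "." and an optional "@modifier"
  m <- regmatches(locale, regexec("^[^.@]*\\.([^@]+)", locale))[[1]]
  # No match means the name carries no codeset at all
  if (length(m) == 2L) m[2L] else NA_character_
}

locale_codeset("en_US.ISO8859-1")  # "ISO8859-1"
locale_codeset("fr_CA.UTF-8")      # "UTF-8"
locale_codeset("en_US")            # NA: the name alone says nothing
```

The last call illustrates the first problem described above: a bare en_US gives no hint about the encoding, so any decision based on it is a guess.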
Have you set R_ENCODING_LOCALES? That's how you tell R what locale to use for latin1 and UTF-8 when checking. Details in R-exts.texi.

As it works for me in 'C' on Leopard with R-devel without setting this, I can't reproduce the problem to check if setting works.

For l10n_info, it is asking the nl_langinfo system. Looks like Darwin is using unusual charset names: it reports ISO8859-1 and we are looking for (the more correct) ISO-8859-1: I've 'hot fixed' that.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
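For readers following along, the charset-name mismatch described in the last message (Darwin reporting ISO8859-1 where ISO-8859-1 is expected) can be illustrated with a spelling-tolerant comparison. This is only a sketch under stated assumptions: the function names canonical_codeset and is_latin1_codeset are made up for illustration, and this is not the hot fix that went into R.

```r
# Hypothetical helpers (not part of R): compare charset names while
# ignoring case and punctuation, so that "ISO8859-1", "ISO-8859-1"
# and "iso_8859-1" are all treated as the same name.
canonical_codeset <- function(x) {
  toupper(gsub("[^[:alnum:]]", "", x))
}

is_latin1_codeset <- function(x) {
  canonical_codeset(x) %in% c("ISO88591", "LATIN1")
}

is_latin1_codeset("ISO8859-1")    # TRUE  (the spelling Darwin reports)
is_latin1_codeset("ISO-8859-1")   # TRUE  (the spelling being looked for)
is_latin1_codeset("ISO8859-15")   # FALSE (Latin-9, a different charset)
```

An exact string comparison would treat the first two spellings as different, which is consistent with l10n_info() reporting FALSE for a locale that really is latin-1.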