type.convert (PR#13646)
William Dunlap wrote:
I can reproduce the difference that Stefan saw, depending
on whether or not I start Rgui with the flags
--no-environ --no-Rconsole
I think it boils down to the isBlankString() function.
For the string "\247" it returns 1 when those flags are
not present and 0 when they are. isBlankString does use
some locale-specific functions:
Rboolean isBlankString(const char *s)
{
#ifdef SUPPORT_MBCS
    if(mbcslocale) {
        wchar_t wc; int used; mbstate_t mb_st;
        mbs_init(&mb_st);
        while( (used = Mbrtowc(&wc, s, MB_CUR_MAX, &mb_st)) ) {
            if(!iswspace(wc)) return FALSE;
            s += used;
        }
    } else
#endif
        while (*s)
            if (!isspace((int)*s++)) return FALSE;
    return TRUE;
}
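As a rough, standalone illustration of what the multibyte branch above does, here is a sketch that uses the standard mbrtowc() instead of R's internal Mbrtowc() wrapper. The name is_blank_mb is made up for this example, and treating an invalid or incomplete byte sequence as non-blank is an assumption of the sketch, not necessarily what R does.

```c
#include <stdlib.h>   /* MB_CUR_MAX */
#include <string.h>   /* memset */
#include <wchar.h>    /* mbrtowc, mbstate_t */
#include <wctype.h>   /* iswspace */

/* Hypothetical helper: returns 1 if every character of s is whitespace
   according to the current LC_CTYPE locale, 0 otherwise. */
int is_blank_mb(const char *s)
{
    mbstate_t st;
    wchar_t wc;
    size_t used;

    memset(&st, 0, sizeof st);          /* start in the initial shift state */
    while ((used = mbrtowc(&wc, s, MB_CUR_MAX, &st)) != 0) {
        /* (size_t)-1: invalid sequence, (size_t)-2: incomplete sequence.
           Assumption of this sketch: both count as "not blank". */
        if (used == (size_t)-1 || used == (size_t)-2) return 0;
        if (!iswspace((wint_t)wc)) return 0;
        s += used;                      /* advance by the bytes consumed */
    }
    return 1;                           /* reached the terminating NUL */
}
```

The result depends on the locale in effect when the function is called, so a caller that wants multibyte decoding would first have to select a suitable locale with setlocale(LC_CTYPE, ...).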
I was using R 2.8.1, downloaded precompiled from CRAN, on Windows
XP SP3. The outputs of sessionInfo() and Sys.getenv() are the same
in both sessions. 'Process Explorer' shows that the 2 sessions
have the same dll's opened.
Thanks for that analysis, Bill! Stefan was in "German_Austria.1252", which I don't think is multibyte, so only the else-clause should be relevant, pointing the finger rather squarely at isspace(). Googling indicates that others have been caught out by signed/unsigned char issues there. Should this possibly rather read

    if (!isspace((unsigned int)*s++)) return FALSE;

??
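For what it is worth, the C standard requires the argument of isspace() to be representable as an unsigned char (or to equal EOF). Where plain char is signed, "\247" holds a negative value, and casting that directly to unsigned int gives a very large number that is still outside the permitted range, so the usual idiom is to convert through unsigned char. A minimal sketch of the non-multibyte branch written that way follows; is_blank_bytes is a hypothetical name, not R's function, and this is not presented as the fix that was actually adopted.

```c
#include <ctype.h>   /* isspace */

/* Hypothetical variant of the single-byte branch only: each byte is
   converted to unsigned char before it reaches isspace(), so the
   argument is always in the range 0..UCHAR_MAX. */
int is_blank_bytes(const char *s)
{
    while (*s) {
        if (!isspace((unsigned char)*s++)) return 0;
    }
    return 1;
}
```

In the default "C" locale this reports "\247" as non-blank. Whether another single-byte locale classifies byte 0xA7 as whitespace is up to the C runtime, but the call itself is then well defined.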
sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
I did the test with a dll compiled from
#include <R.h>
#include <R_ext/Utils.h>

void test_isBlankString(char **s, int *res)
{
    *res = isBlankString(*s);
}
and called by .C("test_isBlankString","\247",-1L)
I don't see the difference while running a version of 2.9.0(devel)
compiled locally on 11 March 2009 (from svn rev 48116).
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Peter Dalgaard
Sent: Friday, April 10, 2009 2:03 AM
To: Raberger, Stefan
Cc: R-bugs at r-project.org; r-devel at stat.math.ethz.ch
Subject: Re: [Rd] type.convert (PR#13646)

Raberger, Stefan wrote:
Hi Peter, each of the four PCs actually has the same locale setting:
Sys.setlocale("LC_CTYPE")
[1] "German_Austria.1252" (all the other settings returned by invoking
Sys.getlocale() are identical as well).
Just to be sure (because it's displayed incorrectly in my browser on the bugtracking page): the character inside the type.convert function ought to be a "section" sign (HTML code &sect; or &#167;, in R "\247", and not a dot ".").

I saw it correctly. It's "\302\247" in UTF8 locales, which is of course the reason I suspected locale settings, but I can't seem to trigger the NA behaviour.

I'm at a loss here, but some ideas:

In the cases where it returns NA, what type is it? (I.e. storage.mode(type.convert(....)))

What do you get from

> charToRaw("§")
[1] c2 a7

(a7, presumably, but better check).

-p
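To make the byte-level difference concrete, here is a small C sketch that prints the bytes of a string in hex, much as charToRaw() does in R. The helper name bytes_to_hex is invented for this illustration. The section sign is the single byte a7 in Windows-1252 ("\247") and the two bytes c2 a7 in UTF-8 ("\302\247").

```c
#include <stdio.h>   /* snprintf */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: writes the bytes of s into out as space-separated
   two-digit hex values, stopping early if out would overflow. */
void bytes_to_hex(const char *s, char *out, size_t outsize)
{
    size_t pos = 0;

    if (outsize == 0) return;
    out[0] = '\0';
    /* Each byte needs at most 3 characters plus the terminating NUL. */
    for (; *s && pos + 4 <= outsize; s++) {
        pos += (size_t)snprintf(out + pos, outsize - pos,
                                pos ? " %02x" : "%02x",
                                (unsigned char)*s);
    }
}
```

Called on "\302\247" this fills the buffer with "c2 a7", and on "\247" with "a7". Note the conversion to unsigned char before formatting: without it, a signed char holding 0xA7 would be sign-extended and printed with leading f digits.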
-----Original Message-----
From: Peter Dalgaard [mailto:p.dalgaard at biostat.ku.dk]
Sent: Thursday, 09 April 2009 19:26
To: Raberger, Stefan
Cc: r-devel at stat.math.ethz.ch; R-bugs at r-project.org
Subject: Re: [Rd] type.convert (PR#13646)

s.raberger at innovest.at wrote:
Full_Name: Stefan Raberger
Version: 2.8.1
OS: Windows XP
Submission from: (NULL) (213.185.163.242)

Hi there,

I recently noticed some strange behaviour of the command "type.convert", depending on the startup mode used. But there also seems to be different behaviour on different PCs (all running the same OS and the same version of R).

On PC1: When I start R in SDI mode (RGui --no-save --no-restore --no-site-file --no-init-file --no-environ) and try to convert, the result is

type.convert("§")
[1] NA

If I use MDI mode (RGui --no-save --no-restore --no-site-file --no-init-file --no-environ --no-Rconsole) instead, the result is

type.convert("§")
[1] §
Levels: §

On PC2 it's exactly the other way round (SDI: §, MDI: NA), on PC3 the result is always NA, independent of the startup mode used, and on PC4 it's always §.

What's the result I should expect R to return, and why is it different in so many cases?
Which locale does R think it is in in the four cases?
(Sys.setlocale("LC_CTYPE"), I think).
Might well not be a bug (so please don't file it as one).
Any help is much appreciated! Regards, Stefan
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907