
Reply: Re: R on Windows crashes when using certain (PR#14143)

2 messages · g.russell at eos-solutions.com, Brian Ripley

#
The new version of R-devel from yesterday morning seems to have fixed bug
14114! Thanks a lot for your help.

Duncan Murdoch <murdoch at stats.uwo.ca> wrote on 14.12.2009 13:34:35:
#
A few comments, though (I've been offline through much of this, 
and away from a Windows machine for almost all).

1) You could have narrowed down the cause by saving and restarting the 
session.  In particular it would have shown that the issue was not in 
sub() as you reported, since saving the object after the sub() call 
and starting a new session caused the problem in the second session.

2) Using gctorture() makes such things happen on much smaller problems 
and more reliably (if no faster).  (The underlying cause was more than 
one missing PROTECT.)

3) In 2.10.x the difference between fixed = TRUE (which you should have 
used in the first place) and the extended and PCRE versions often lies 
in the encoding of the result: use Encoding() to find out.  Not only 
is fixed = TRUE much faster, it also avoids repeated re-encodings.

4) Using UTF-8 encoded strings in a non-UTF-8 locale (and in 
particular on Windows) is a convenience but has performance 
implications.  Unless you need text not representable in the current 
locale, convert your strings to the current charset.  If you are using 
non-ASCII text and an 8-bit locale (e.g. CP1252 on Windows) then 
regexp computations will work somewhat faster in R-devel since they 
are performed in bytes (whereas 2.10.x uses wchar_t and for [g]sub 
returns the result in UTF-8).

5) These reports show yet again that people are not doing enough to 
help in the alpha/beta testing period of 2.x.0.  The R developers are 
almost exclusively using ASCII data or UTF-8 locales, so people doing 
extensive text processing in other locales please do take note of 
requests to test new versions of R.
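
As a rough illustration of points 1) to 4), here is a minimal sketch in R.
The input string, the pattern and the variable names are made up for the
example and are not taken from the original bug report.  What Encoding()
reports depends on the R version and the locale, so no output is shown.

```r
# Hypothetical input: a short string, marked as UTF-8, with one non-ASCII character.
x <- "caf\u00e9 au lait"

# Point 1: save the result of the suspect call so that it can be examined
# separately from the call that produced it.  In practice one would quit R
# and load() the file in a fresh session; a new environment stands in for
# the second session here so that the sketch runs in one go.
y <- sub("au", "with", x)
f <- tempfile(fileext = ".RData")   # hypothetical file location
save(y, file = f)
second <- new.env()
load(f, envir = second)

# Point 2: gctorture() forces a garbage collection at (almost) every
# allocation, so a missing PROTECT tends to show up on small inputs.
# It is slow, so switch it off again straight away.
gctorture(TRUE)
z <- sub("au", "with", x)
gctorture(FALSE)

# Point 3: literal matching with fixed = TRUE, then compare the declared
# encodings of the input and of both results.
y_fixed <- sub("au", "with", x, fixed = TRUE)
encodings <- Encoding(c(x, y, y_fixed))

# Point 4: convert to the current locale's charset where the text is
# representable in it (enc2native() is one way; iconv() is another).
x_native <- enc2native(x)
```

Printing `encodings` shows whether the regexp and fixed versions mark
their results differently on a given installation.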
On Tue, 15 Dec 2009, g.russell at eos-solutions.com wrote: