Operations with long altrep vectors cause segfaults on Windows

20 messages · Martin Maechler, luke-tierney at uiowa.edu, Jeroen Ooms +2 more

#
I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):

$> R --vanilla
x <- c(0L, -2e9:2e9)

# > Segmentation fault

Tried to reproduce on Linux but the above worked as expected. Not an
issue merely with the length of the vector; for example, x <-
rep_len(1:10, 1e10) works, though the altrep vector must be long to
reproduce:

x <- c(0L, -1e9:1e9)  #ok

Segmentation faults occur with the following too:

x <- (-2e9:2e9) + 1L
#
> I can only reproduce on Windows, but reliably (both 4.0.0 and 4.0.2):

    > $> R --vanilla
    > x <- c(0L, -2e9:2e9)

    > # > Segmentation fault

    > Tried to reproduce on Linux but the above worked as expected. Not an
    > issue merely with the length of the vector; for example, x <-
    > rep_len(1:10, 1e10) works, though the altrep vector must be long to
    > reproduce:

    > x <- c(0L, -1e9:1e9)  #ok

    > Segmentation faults occur with the following too:

    > x <- (-2e9:2e9) + 1L

Your operation would "need" (not in theory, but in practice)
to go from altrep to regular vectors.
I guess the segfault occurs because of something like this :

 R asks Windows to hand it a huge amount of memory and Windows replies
 "ok, here is the memory pointer"
 and then R tries to write to there, but illegally (because
 Windows should have told R that it does not really have enough
 memory for that ..). 
 
I cannot reproduce the segmentation fault .. but I can confirm
there is a bug there that shows for me on Windows but not on
Linux:

"My" Windows runs on a terminal server without many GB of memory
(but in a version of Windows that recognizes that it cannot
 get that much memory):

------------------------- Here some transcript (thanks to
                          using Emacs w/ ESS also on Windows) ------------------

R Under development (unstable) (2020-08-24 r79074) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R ist freie Software und kommt OHNE JEGLICHE GARANTIE.
Sie sind eingeladen, es unter bestimmten Bedingungen weiter zu verbreiten.
Tippen Sie 'license()' or 'licence()' für Details dazu.

R ist ein Gemeinschaftsprojekt mit vielen Beitragenden.
Tippen Sie 'contributors()' für mehr Information und 'citation()',
um zu erfahren, wie R oder R packages in Publikationen zitiert werden können.

Tippen Sie 'demo()' für einige Demos, 'help()' für on-line Hilfe, oder
'help.start()' für eine HTML Browserschnittstelle zur Hilfe.
Tippen Sie 'q()', um R zu verlassen.
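
> x <- (-2e9:2e9) + 1L
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> y <- c(0L, -2e9:2e9)
Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
> Sys.setenv(LANGUAGE="en")
> y <- c(0L, -2e9:2e9)
Error: cannot allocate vector of size 14.9 Gb
> y <- -1e9:4e9
> .Internal(inspect(y))
@0x00000000195a6808 14 REALSXP g0c0 [REF(65535)]  -1000000000 : -294967296 (compact)
> .Machine$integer.max / 1e9
[1] 2.147484
> y <- -1e6:2.2e9
> .Internal(inspect(y))
@0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)]  -1000000 : -2094967296 (compact)
> y <- -1e6:2e9
> .Internal(inspect(y))
@0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)]  -1000000 : 2000000000 (compact)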
------------------------- end of transcript -----------------------------------

So indeed, no seg.fault, R notices that it can't get 15 GB of
memory.

But the bug is bad news:  We have *silent* integer overflow happening
according to what  .Internal(inspect(y)) shows...

 .... less bad news: Probably the bug is only in the 'internal inspect' code
 where a format specifier is used in C's printf() that does not work
 correctly on Windows, at least the way it is currently compiled ..


On (64-bit) Linux, I get

> y <- -1e9:4e9 ; .Internal(inspect(y))
@7d86388 14 REALSXP g0c0 [REF(65535)]  -1000000000 : 4000000000 (compact)

> y <- c(0L, y)
Error: cannot allocate vector of size 37.3 Gb

which seems much better ... until I do find a bug, maybe again
only in the C code underlying .Internal(inspect(.)) :

> y <- -1e9:2e9 ; .Internal(inspect(y))
@7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
#
    >> x <- (-2e9:2e9) + 1L
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
    >> y <- c(0L, -2e9:2e9)
    > Fehler: kann Vektor der Größe 14.9 GB nicht allozieren
    >> Sys.setenv(LANGUAGE="en")
    >> y <- c(0L, -2e9:2e9)
    > Error: cannot allocate vector of size 14.9 Gb
    >> y <- -1e9:4e9
    >> .Internal(inspect(y))
    > @0x00000000195a6808 14 REALSXP g0c0 [REF(65535)]  -1000000000 : -294967296 (compact)
    >> .Machine$integer.max / 1e9
    > [1] 2.147484
    >> y <- -1e6:2.2e9
    >> .Internal(inspect(y))
    > @0x000000000a11a5d8 14 REALSXP g0c0 [REF(65535)]  -1000000 : -2094967296 (compact)
    >> y <- -1e6:2e9
    >> .Internal(inspect(y))
    > @0x000000000a13adf0 13 INTSXP g0c0 [REF(65535)]  -1000000 : 2000000000 (compact)
    >> 
    > ------------------------- end of transcript -----------------------------------

    > So indeed, no seg.fault, R notices that it can't get 15 GB of
    > memory.

    > But the bug is bad news:  We have *silent* integer overflow happening
    > according to what  .Internal(inspect(y)) shows...

    > .... less bad news: Probably the bug is only in the 'internal inspect' code
    > where a format specifier is used in C's printf() that does not work
    > correctly on Windows, at least the way it is currently compiled ..


    > On (64-bit) Linux, I get

    >> y <- -1e9:4e9 ; .Internal(inspect(y))
    > @7d86388 14 REALSXP g0c0 [REF(65535)]  -1000000000 : 4000000000 (compact)

    >> y <- c(0L, y)
    > Error: cannot allocate vector of size 37.3 Gb

    > which seems much better ... until I do find a bug, maybe again
    > only in the C code underlying .Internal(inspect(.)) :

    >> y <- -1e9:2e9 ; .Internal(inspect(y))
    > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
    >> 

Indeed, the purported "integer overflow" (above) does not
happen.
It is "only" a  'printf' related bug inside .Internal(inspect(.)) on Windows.

*interestingly*, the above bug I've noticed on (64-bit) Linux
does *not* show on Windows (64-bit), at least not for that case:

On Windows, things are fine as long as they remain (compacted
aka 'ALTREP') INTSXP:

  > y <- -1e3:2e9 ;.Internal(inspect(y))
  @0x000000000a285648 13 INTSXP g0c0 [REF(65535)]  -1000 : 2000000000 (compact)
  > y <- -1e3:2.1e9 ;.Internal(inspect(y))
  @0x0000000019925930 13 INTSXP g0c0 [REF(65535)]  -1000 : 2100000000 (compact)

and here, y is correct, just the printing from
.Internal(inspect(y)) is buggy (probably prints the double as an integer):

  > y <- -1e3:2.2e9 ; .Internal(inspect(y))
  @0x00000000195c0178 14 REALSXP g0c0 [REF(65535)]  -1000 : -2094967296 (compact)
  > length(y)
  [1] 2200001001
  > tail(y)
  [1] 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09 2.2e+09
  > tail(y) - 2.2e9
  [1] -5 -4 -3 -2 -1  0
  >
#
Thanks Martin.  On further testing, it seems that the segmentation
fault can only occur when the amount of obtainable memory is
sufficiently high. On my machine (admittedly with other processes
running):

$ R --vanilla --max-mem-size=30G -e "x <- c(0L, -2e9:2e9)"
Segmentation fault

$ R --vanilla --max-mem-size=29G -e "x <- c(0L, -2e9:2e9)"
Error: cannot allocate vector of size 14.9 Gb
Execution halted
On Tue, 8 Sep 2020 at 18:52, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
#
On Tue, 8 Sep 2020, Hugh Parsonage wrote:

            
Unfortunately I don't have access to a Windows machine with enough
memory to get to the point of failure. If you have rtools and gdb
installed can you run in gdb and see where the segfault is happening?

Best,

luke

  
    
#
On Tue, 8 Sep 2020, Martin Maechler wrote:

            
It's a '%ld' that probably needs to be '%lld' for Windows. Will fix
sometime soon.

Best,

luke
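
For readers unfamiliar with the '%ld' remark: on 64-bit Windows the C type `long` is only 32 bits wide, so a 64-bit value pushed through '%ld' (or cast to `long` first) typically shows up as its low 32 bits. The sketch below is not R's source code; `wrap_to_int32` and `format_int64` are hypothetical helper names. The first reproduces the arithmetic behind the odd numbers in the Windows transcript, and the second shows one portable way to print a 64-bit integer, using `PRId64` from `<inttypes.h>`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: the value a 64-bit integer takes when only its low
   32 bits are kept and read back as a signed 32-bit (two's complement) number. */
int32_t wrap_to_int32(int64_t v)
{
    uint32_t low = (uint32_t)v;  /* well-defined: reduces v modulo 2^32 */

    /* Rebuild the signed value arithmetically, so the code does not rely on
       an implementation-defined narrowing conversion. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL);
}

/* Hypothetical helper: format a 64-bit integer portably. PRId64 expands to
   the matching conversion specifier on both LP64 (Linux) and LLP64 (Windows)
   platforms. Returns what snprintf returns. */
int format_int64(char *buf, size_t size, int64_t v)
{
    return snprintf(buf, size, "%" PRId64, v);
}
```

With these definitions, `wrap_to_int32(4000000000LL)` gives -294967296 and `wrap_to_int32(2200000000LL)` gives -2094967296, the two end points that .Internal(inspect(y)) displayed on Windows, which is consistent with a display-only problem rather than a real overflow in `y`.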

  
    
#
Unfortunately I only get

[Thread 21752.0x4aa8 exited with code 3221225477]
[Thread 21752.0x4514 exited with code 3221225477]
[Thread 21752.0x3f10 exited with code 3221225477]
[Inferior 1 (process 21752) exited with code 030000000005]

(I'm guessing I would need to build an instrumented version of R, or
can R be debugged using gdb with an off-the-shelf installation?)
On Wed, 9 Sep 2020 at 00:32, <luke-tierney at uiowa.edu> wrote:
#
On 9/8/20 4:48 PM, Hugh Parsonage wrote:
No, the default build lacks debug symbols. You need a build with debug 
symbols, and if you can reproduce in a build without compiler 
optimizations (-O0), the backtrace may be easier to interpret. Some bugs 
however "disappear" when optimizations are disabled. You can build R 
from source (and there may be debug builds provided by someone else 
(Jeroen?)).

Tomas
#

        
> On Tue, 8 Sep 2020, Martin Maechler wrote:
>>>>>>> Martin Maechler
    >>>>>>> on Tue, 8 Sep 2020 10:40:24 +0200 writes:
    >> 
    >>>>>>> Hugh Parsonage
    >>>>>>> on Tue, 8 Sep 2020 18:08:11 +1000 writes:
    >> >> y <- -1e9:2e9 ; .Internal(inspect(y))
    >> > @7d86ac0 13 INTSXP g0c0 [REF(65535)] Error: long vectors not supported yet: ../../../R/src/main/altclasses.c:139
    >> and here, y is correct, just the printing from
    >> .Internal(inspect(y)) is bugous (probably prints the double as an integer):

    > It's a '%ld' that probably needs to be '%lld' for Windows. Will fix
    > sometime soon.

    > Best,
    > luke

I had guessed at something like that .. but "interestingly" it
was quite different:

Our code used   int n = LENGTH(.);
and the error message above was triggered there.

I've committed a fix to both R-devel and R-patched (and added a
regression test),
but I still wonder why the above error had not triggered on Windows...

Martin
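
The `int n = LENGTH(.)` problem can be illustrated without R's headers. In the sketch below, `xlen_t` is only a stand-in for R's `R_xlen_t`, and `seq_length` and `length_as_int` are hypothetical helpers, not part of R's API. It assumes a 32-bit `int`. The point is that a compact sequence such as -1e9:2e9 can have INTSXP end points while still holding more elements than an `int` can count, so its length has to be kept in a wider type or checked before narrowing.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-in for R's R_xlen_t (an assumption made for this sketch). */
typedef int64_t xlen_t;

/* Number of elements in the unit-step sequence from:to, assuming to >= from
   and whole-number end points. */
xlen_t seq_length(double from, double to)
{
    return (xlen_t)(to - from) + 1;
}

/* Hypothetical helper: store the length in *out only if it fits in an int.
   Returns 1 on success and 0 for a "long" vector (length > INT_MAX). */
int length_as_int(xlen_t n, int *out)
{
    if (n < 0 || n > (xlen_t)INT_MAX)
        return 0;
    *out = (int)n;
    return 1;
}
```

Here `seq_length(-1e9, 2e9)` is 3000000001, so `length_as_int` refuses it, whereas the length of -1e3:2e9 (2000001001) still fits in an `int`.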

    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
#
On Tue, 8 Sep 2020, Martin Maechler wrote:

            
It does for me without the fix, so no additional Windows quirk here at
least.

Best,

luke

  
    
#
On Tue, Sep 8, 2020 at 5:20 PM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
Debug builds for each revision are available from
https://r-devel.github.io . To download the installer you need to
click the github icon in the last column in the table. You need to be
signed in with a (free) Github account in order to download builds
(artifacts) from Github actions. It will show download links for both
the regular installer and installer with debug symbols.

In other news, the https://r-devel.github.io table also shows that the
fix that Martin committed is segfaulting on 32-bit.
#
On Tue, Sep 8, 2020 at 11:44 PM Jeroen Ooms <jeroenooms at gmail.com> wrote:
Sorry that was inaccurate, it is not segfaulting at all, but the unit
test is raising an error on 32-bit.
#
I am unable to set a breakpoint or use gdb with any success when I use that version.

On Linux I would do R -d gdb but this gives "unknown option '-d' "
while gdb R.exe (in the same directory as the debug version) gives the
same output as before.

I'm happy to help but I appreciate this list might not be the best
place to get a tutorial on using gdb on Windows.
On Wed, 9 Sep 2020 at 07:47, Jeroen Ooms <jeroenooms at gmail.com> wrote:
#
On 9/9/20 8:48 AM, Hugh Parsonage wrote:
Essentially, the steps are: build with DEBUG=T (to have debug symbols), 
possibly updating EOPTS in MkRules.local to disable optimizations, then 
run gdb loading RGui, "set solib-search-path", run RGui from gdb. Then 
you can break to debugger from RGui menu, or just run the code that 
segfaults, and you get to gdb and can print the stacktrace, etc. You can 
find some information in rw-FAQ (R for Windows FAQ), but yes, it is 
harder than on Linux. We can take care of this report, but of course in 
the longer term it would help if more people could take the time to 
set up debugging and analyze bugs even on Windows.

Tomas
#
Thank you!

I get

Starting program: C:\R\R-devel-20200909\bin\x64\Rgui.exe
[New Thread 19940.0x638c]
[New Thread 19940.0x102c]
[New Thread 19940.0x329c]
[New Thread 19940.0x37dc]
warning: Invalid parameter passed to C runtime function.

Program received signal SIGSEGV, Segmentation fault.
0x000000006c72d206 in compact_intseq_Dataptr (x=0x12783350,
writeable=<optimized out>) at altclasses.c:169
169     altclasses.c: No such file or directory.
On Wed, 9 Sep 2020 at 17:03, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
#
On 9/9/20 9:30 AM, Hugh Parsonage wrote:
Thanks, would you know which svn version this is?

Tomas
#
R Under development (unstable) (2020-09-08 r79165)
On Wed, 9 Sep 2020 at 18:00, Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
#
Thanks. Should be now fixed in 79169.
Tomas
On 9/9/20 10:32 AM, Hugh Parsonage wrote:
#
On 9/8/20 11:47 PM, Jeroen Ooms wrote:
Now fixed, the test needs to be run only on 64-bit builds where such 
long vectors/sequences are allowed.

Tomas
#
I can confirm the segmentation fault does not occur as of r79170.
On Wed, 9 Sep 2020 at 19:06, Tomas Kalibera <tomas.kalibera at gmail.com> wrote: