Minor bug with stats::isoreg

Thu, Sep 28, 2023 1:53 AM

> On Wed, 27 Sep 2023 13:49:58 -0700 Travers Ching
    > <traversc at gmail.com> wrote:

    >> Calling isoreg with an Inf value causes a segmentation
    >> fault, tested on R 4.3.1 and R 4.2. A reproducible
    >> example is: `isoreg(c(0,Inf))`

    > Indeed, the code in src/library/stats/src/isoreg.c
    > contains the following loop:

The above would not even be sufficient: 
It's the sum(y) really, because internally
  yc <- cumsum(c(0,y)) and actually  diff(yc)  is used
where you get to  Inf - Inf ==> NaN

*** caught segfault ***
address 0x7e48000, cause 'memory not mapped'
/u/maechler/bin/R_arg: line 160: 873336 Segmentation fault  (core dumped) $exe $@
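
The Inf - Inf step described above can be seen without calling isoreg() at all (so this is safe to run on affected R versions). This is only a sketch of the R-level arithmetic, not of the C code; the vectors y, yc and yc_inf are made up for illustration.

```r
# Finite but huge values: each entry is representable, their sum is not.
y  <- c(1e308, 1e308, 1e308)
yc <- cumsum(c(0, y))

is.finite(y)    # all TRUE
is.finite(yc)   # TRUE TRUE FALSE FALSE: the running sum overflows to Inf
diff(yc)        # the last element is Inf - Inf, i.e. NaN

# An actual Inf entry leads to the same pattern.
yc_inf <- cumsum(c(0, 0, Inf, 1))
diff(yc_inf)    # 0, Inf, NaN
```

So checking the individual y[i] for finiteness is not enough; it is the cumulative sum that has to stay finite.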

Also, the C code still does not work for long vectors,
so I want to change the C code anyway.


In any case:
  Thank you, Travers, Ben, and Ivan, for reporting and addressing
  the issue!


------

There is an interesting point here though:

For dealing with +/- Inf, we used to follow this idea
in R quite keenly (and sometimes to an extreme):

If 'Inf' leads a computation to "fail" (NB:  1/Inf  |-->  0 does *not* fail)
try to see what the mathematical *or* computational limit
 x --> Inf would be.
If that is easily defined, we use that.

So, often as a first step, look at what happens if you replace
Inf by 1e100  (and then also what happens if you are finite but
*close* to Inf, i.e. the 7e308 above).
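
As a small illustration of that first step, here is a sketch with a made-up function f (not anything from the stats package) whose value at Inf "fails" although its limit is obvious. The names f and f_ext are hypothetical.

```r
# Hypothetical example function: the limit for x --> Inf is 1.
f <- function(x) x / (1 + x)

f(Inf)     # NaN, because Inf / Inf is undefined
f(1e100)   # 1: what happens when Inf is replaced by a large finite value
f(1e308)   # 1: still fine when finite but close to Inf

# Using the limit as the definition at +/- Inf:
f_ext <- function(x) ifelse(is.infinite(x), 1, x / (1 + x))
f_ext(c(1, 1e100, Inf, -Inf))
```

Here the limit is "easily defined", so f_ext() simply returns it instead of NaN.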

Now here, at least in some cases, such limit cases are clearly
detectable, e.g., when you let  y[2] --->  -Inf here:

              [,1]         [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   4.5000e+00   4.5000e+00 4.50 4.50    5    6    6    6    6     8
 [2,]   4.2500e+00   4.2500e+00 4.25 4.25    5    6    6    6    6     8
 [3,]   4.0000e+00   4.0000e+00 4.00 4.00    5    6    6    6    6     8
 [4,]   3.7500e+00   3.7500e+00 3.75 3.75    5    6    6    6    6     8
 [5,]   3.5000e+00   3.5000e+00 3.50 3.50    5    6    6    6    6     8
 [6,]   3.2500e+00   3.2500e+00 3.25 3.25    5    6    6    6    6     8
 [7,]   3.0000e+00   3.0000e+00 3.00 3.00    5    6    6    6    6     8
 [8,]   2.7500e+00   2.7500e+00 2.75 2.75    5    6    6    6    6     8
 [9,]   2.5000e+00   2.5000e+00 2.50 2.50    5    6    6    6    6     8
[10,]   2.2500e+00   2.2500e+00 2.25 2.25    5    6    6    6    6     8
[11,]   2.0000e+00   2.0000e+00 2.00 2.00    5    6    6    6    6     8
[12,]  -2.5000e+00  -2.5000e+00 1.00 2.00    5    6    6    6    6     8
[13,]  -7.5000e+00  -7.5000e+00 1.00 2.00    5    6    6    6    6     8
[14,]  -4.9750e+02  -4.9750e+02 1.00 2.00    5    6    6    6    6     8
[15,]  -4.9975e+03  -4.9975e+03 1.00 2.00    5    6    6    6    6     8
[16,]  -5.0000e+09  -5.0000e+09 1.00 2.00    5    6    6    6    6     8
[17,]  -5.0000e+99  -5.0000e+99 0.00 0.00    0    0    0    0    0     0
[18,] -5.0000e+199 -5.0000e+199 0.00 0.00    0    0    0    0    0     0
[19,] -5.0000e+299 -5.0000e+299 0.00 0.00    0    0    0    0    0     0

so one could say that ideally,

      isoreg(c(5, -Inf, 1:2, 5:8, 3, 8))

should produce fitted values

             c(-Inf, -Inf, 0, 0, ..., 0)

and if someone has a more or less elegant implementation
we could again allow  +/-Inf entries in  isoreg(), at least when
the Infs all have the same sign.
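
For experimenting with this, here is a minimal pool-adjacent-violators sketch in plain R. It is not the isoreg() C code and not a proposed patch; the name pava_sketch and the y[2] values (read off from the fitted values in the table above) are assumptions. It pools block sums and sizes instead of differencing a cumulative sum, so same-sign Infs never produce Inf - Inf.

```r
pava_sketch <- function(y) {
  stopifnot(is.numeric(y), !anyNA(y))
  if (any(y == Inf) && any(y == -Inf))
    stop("infinite values of both signs are not supported")
  n <- length(y)
  s <- numeric(n)   # block sums
  w <- integer(n)   # block sizes
  k <- 0L           # number of blocks so far
  for (i in seq_len(n)) {
    k <- k + 1L
    s[k] <- y[i]
    w[k] <- 1L
    # pool with the previous block while the block means decrease
    while (k > 1L && s[k - 1L] / w[k - 1L] > s[k] / w[k]) {
      s[k - 1L] <- s[k - 1L] + s[k]
      w[k - 1L] <- w[k - 1L] + w[k]
      k <- k - 1L
    }
  }
  b <- seq_len(k)
  rep(s[b] / w[b], times = w[b])
}

# Agreement with isoreg() for finite y[2] where isoreg() loses no precision
y2 <- c(10:0, -c(10, 20, 1e3, 1e4, 1e10))
ok <- vapply(y2, function(v) {
  y <- c(5, v, 1:2, 5:8, 3, 8)
  isTRUE(all.equal(pava_sketch(y), isoreg(y)$yf))
}, logical(1))
all(ok)

pava_sketch(c(5, -Inf, 1:2, 5:8, 3, 8))
```

Note that for the -Inf input this sketch keeps the tail of rows 12 to 16 of the table (1, 2, 5, 6, 6, 6, 6, 8) after the two -Inf values, because it never subtracts huge cumulative sums; whether that or another convention is wanted would still have to be decided.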

Martin
