Martin Maechler <maechler at stat.math.ethz.ch>
on Thu, 1 Feb 2018 16:34:04 +0100 writes:
Hervé Pagès <hpages at fredhutch.org>
on Tue, 30 Jan 2018 13:30:18 -0800 writes:
> Hi Martin, Henrik,
> Thanks for the follow up.
> @Martin: I vote for 2) without *any* hesitation :-)
> (and uniformity could be restored at some point in the
> future by having prod(), rowSums(), colSums(), and others
> align with the behavior of length() and sum())
As a matter of fact, I had stopped procrastinating and already worked
a bit on implementing '2)' over the weekend, and made it work
- more or less. It needs a bit more work, and I had also been considering
replacing the numbers in the current overflow check
    if (ii++ > 1000) {                                          \
        ii = 0;                                                 \
        if (s > 9000000000000000L || s < -9000000000000000L) {  \
            if(!updated) updated = TRUE;                        \
            *value = NA_INTEGER;                                \
            warningcall(call, _("integer overflow - use sum(as.numeric(.))")); \
            return updated;                                     \
        }                                                       \
    }                                                           \
i.e., I thought of tweaking the '1000' and the '9000000000000000L',
but decided to leave them as they are and add comments there about why,
for the moment.
They may look arbitrary, but are not at all: if you multiply
them (which is the relevant product, given that we check the sum 's' only
every 1000-th time ... ((though I'm still not sure they *are* correct)))
you get 9*10^18, which is only slightly smaller than
2^63 - 1 (about 9.22*10^18), the maximal "LONG_INT" (64-bit signed)
integer we have.
So, in the end, at least for now, we do not quite go all the way
but flag overflow a bit earlier, ... but do potentially gain a bit of
speed, notably with the ITERATE_BY_REGION(..) macros
(which I did not show above).
Will hopefully become available in R-devel real soon now.
Martin