sum() returns NA on a long *logical* vector when nb of TRUE values exceeds 2^31

I second this feature request (it's understandable that this, and
possibly other parts of the code, were left behind / forgotten after the
introduction of long vectors).

I think mean() avoids full copies, so in the meantime you can work
around this limitation using:

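countTRUE <- function(x, na.rm = FALSE) {
  nx <- length(x)
  ## sum() is fine while the count is guaranteed to fit in an integer
  if (nx < .Machine$integer.max) return(sum(x, na.rm = na.rm))
  ## drop NAs first (this copies), so that length * mean is the TRUE count
  if (na.rm) x <- x[!is.na(x)]
  length(x) * mean(x)
}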

(not sure if one needs to worry about rounding errors, i.e. cases where n %% 1 != 0)
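
To make the rounding concern concrete, here is a minimal sketch on a small
vector. countTRUE_rounded() is a hypothetical helper, not part of the
workaround above; it assumes the true count is far below 2^53, so that the
floating-point error of length * mean stays well under 0.5 and round()
recovers the whole number. I have not checked this on an actual long vector.

```r
## Hypothetical helper (an assumption, not from the original workaround):
## round the mean-based estimate to the nearest whole number.
countTRUE_rounded <- function(x) {
  round(length(x) * mean(x))
}

## Small example: one TRUE among 49 values. In floating-point arithmetic,
## 49 * (1/49) need not be exactly 1, so the raw estimate may carry a
## fractional part on some platforms.
x <- c(TRUE, rep(FALSE, times = 48))
n_raw <- length(x) * mean(x)
print(n_raw %% 1 != 0)   # whether the raw estimate has a fractional part

n <- countTRUE_rounded(x)
print(n == sum(x))       # compare against the exact integer count
```

Whether the fractional part shows up depends on the platform's floating-point
arithmetic, but rounding should make the result a whole number either way.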

x <- rep(TRUE, times = .Machine$integer.max+1)
object.size(x)
## 8589934632 bytes

p <- profmem::profmem( n <- countTRUE(x) )
str(n)
## num 2.15e+09
print(n == .Machine$integer.max + 1)
## [1] TRUE

print(p)
## Rprofmem memory profiling of:
## n <- countTRUE(x)
##
## Memory allocations:
##      bytes calls
## total     0


FYI / related: I've just updated matrixStats::sum2() to support
logicals (develop branch) and I'll also try to update
matrixStats::count() to count beyond .Machine$integer.max.

/Henrik
On Fri, Jun 2, 2017 at 4:05 AM, Hervé Pagès <hpages at fredhutch.org> wrote: