On Jun 8, 2018, at 11:52 AM, Hadley Wickham <h.wickham at gmail.com> wrote:
On Fri, Jun 8, 2018 at 11:38 AM, Berry, Charles <ccberry at ucsd.edu> wrote:
On Jun 8, 2018, at 10:37 AM, Hervé Pagès <hpages at fredhutch.org> wrote:
Also the TRUEs cause problems if some dimensions are 0:
matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]
Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
(subscript) logical subscript too long
OK. But this is easy enough to handle.
H.
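One possible way to handle the zero-extent case, sketched here as an illustration and not necessarily what was meant above, is to leave the trailing subscripts empty instead of filling them with TRUE. The helper name `subset_rows` is hypothetical; the empty subscripts are built with `quote(expr = )`.

```r
# Hypothetical helper (not from the thread): select "rows" of an array of any
# rank, leaving the remaining subscripts empty instead of passing TRUE.
subset_rows <- function(x, i) {
  nd <- max(1L, length(dim(x)))
  # One empty (missing) argument per trailing dimension.
  empty <- rep(list(quote(expr = )), nd - 1L)
  # Builds the call x[i, , ..., drop = FALSE] and evaluates it here.
  cl <- as.call(c(list(as.name("["), quote(x), quote(i)), empty,
                  list(drop = FALSE)))
  eval(cl)
}

m <- matrix(raw(0), nrow = 5, ncol = 0)
dim(subset_rows(m, 1:3))
```

Because no logical subscript is supplied for the zero-length dimension, nothing needs to be recycled and the "logical subscript too long" error should not arise.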
On 06/08/2018 10:29 AM, Hadley Wickham wrote:
I suspect this will have suboptimal performance since the TRUEs will
get recycled. (Maybe there is, or could be, ALTREP support for
recycling.)
Hadley
AFAICS, it is not an issue. Taking
arr <- array(rnorm(2^22),c(2^10,4,4,4))
as a test case
and using a function that will either use the literal code `x[i,,,,drop=FALSE]' or `eval(mc)':
subset_ROW4 <-
    function(x, i, useLiteral=FALSE)
{
    literal <- quote(x[i,,,,drop=FALSE])
    mc <- quote(x[i])
    nd <- max(1L, length(dim(x)))
    mc[seq(4,length=nd-1L)] <- rep(TRUE, nd-1L)
    mc[["drop"]] <- FALSE
    if (useLiteral)
        eval(literal)
    else
        eval(mc)
}
I get identical times with
system.time(for (i in 1:10000) subset_ROW4(arr,seq(1,length=10,by=100),TRUE))
and with
system.time(for (i in 1:10000) subset_ROW4(arr,seq(1,length=10,by=100),FALSE))
I think that's because you used a relatively low-precision timing
mechanism, and included the index generation in the timing. I see:
arr <- array(rnorm(2^22),c(2^10,4,4,4))
i <- seq(1,length = 10, by = 100)
bench::mark(
arr[i, TRUE, TRUE, TRUE],
arr[i, , , ]
)
#> # A tibble: 2 x 1
#>   expression         min    mean   median      max  n_gc
#>   <chr>          <bch:t> <bch:t> <bch:tm> <bch:tm> <dbl>
#> 1 arr[i, TRUE,…    7.4µs  10.9µs  10.66µs   1.22ms     2
#> 2 arr[i, , , ]    7.06µs   8.8µs   7.85µs 538.09µs     2
So not a huge difference, but it's there.
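For readers without the bench package, the same point about keeping index generation out of the timed code can be sketched in base R. This is only a rough illustration: the array is smaller than the one in the thread so it runs quickly, the repetition count `n` is an arbitrary choice, and `system.time` remains a coarse timer, so small differences may not show up.

```r
# Smaller array than in the thread so the sketch runs quickly.
arr <- array(rnorm(2^16), c(2^10, 4, 4, 4))
# Build the index once, outside the timed code.
i <- seq(1, length.out = 10, by = 100)

# Both forms select the same elements.
same <- identical(arr[i, TRUE, TRUE, TRUE], arr[i, , , ])

# Repeat each expression many times, since one call takes only microseconds.
n <- 1e4
t_true  <- system.time(for (k in seq_len(n)) arr[i, TRUE, TRUE, TRUE])[["elapsed"]]
t_empty <- system.time(for (k in seq_len(n)) arr[i, , , ])[["elapsed"]]
```

Comparing `t_true` and `t_empty` over several runs gives a crude check of the difference reported above.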