Subsetting the "ROW"s of an object
On 06/08/2018 02:15 PM, Hadley Wickham wrote:
On Fri, Jun 8, 2018 at 2:09 PM, Berry, Charles <ccberry at ucsd.edu> wrote:
On Jun 8, 2018, at 1:49 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
Hmmm, yes, there must be some special case in the C code to avoid recycling a length-1 logical vector:
Here is a version that (I think) handles Herve's issue of arrays having one or more 0 dimensions.
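subset_ROW <-
    function(x, i)
{
    dims <- dim(x)                      # NULL for a plain vector
    ## positions in the call of the trailing dimensions with non-zero extent
    index_list <- which(dims[-1] != 0L) + 3
    mc <- quote(x[i])
    nd <- max(1L, length(dims))
    mc[ index_list ] <- list(TRUE)      # select everything along those dimensions
    mc[[ nd + 3L ]] <- FALSE
    names( mc )[ nd+3L ] <- "drop"      # the last argument becomes drop=FALSE
    eval(mc)
}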
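To see what call subset_ROW ends up evaluating, here is a minimal sketch that repeats the same construction from a dims vector but returns the call instead of evaluating it. The helper name show_ROW_call is hypothetical (it is not from the thread) and is only meant for inspecting the result.

```r
# Hypothetical helper: same steps as subset_ROW, but the call is returned
# rather than evaluated, so it can be printed and inspected.
show_ROW_call <- function(dims) {
  index_list <- which(dims[-1] != 0L) + 3
  mc <- quote(x[i])
  nd <- max(1L, length(dims))
  mc[index_list] <- list(TRUE)
  mc[[nd + 3L]] <- FALSE
  names(mc)[nd + 3L] <- "drop"
  mc
}

print(show_ROW_call(NULL))            # plain vector (dim(x) is NULL)
print(show_ROW_call(c(3L, 4L)))       # matrix
print(show_ROW_call(c(3L, 0L, 2L)))   # array with a zero-extent dimension

# Evaluating the matrix call by hand should match direct indexing.
m <- matrix(1:12, nrow = 3)
res <- eval(show_ROW_call(dim(m)), list(x = m, i = 2L))
stopifnot(identical(res, m[2L, , drop = FALSE]))
```

Printing the calls for a few shapes makes it easier to check which positions get TRUE and where drop=FALSE lands.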
Curiously enough the timing is *much* better for this implementation than for the first version I sent.
Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to slow things down noticeably. It requires almost twice (!!) as much time as the version above.
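For reference, the `alist(a=)' variant described above might look like the sketch below. The function name subset_ROW_missing is hypothetical, and this is a reconstruction from the description, not necessarily the exact code that was timed.

```r
# Sketch of the variant that fills the trailing positions with empty
# (missing) arguments instead of TRUE. The name is hypothetical.
subset_ROW_missing <- function(x, i) {
  dims <- dim(x)
  index_list <- which(dims[-1] != 0L) + 3
  mc <- quote(x[i])
  nd <- max(1L, length(dims))
  mc[index_list] <- alist(a = )   # empty argument, recycled over index_list
  mc[[nd + 3L]] <- FALSE
  names(mc)[nd + 3L] <- "drop"
  eval(mc)
}

# For a 3-d array this evaluates a call of the form x[i, , , drop = FALSE].
a <- array(1:24, dim = c(2L, 3L, 4L))
stopifnot(identical(subset_ROW_missing(a, 2L), a[2L, , , drop = FALSE]))
```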
I think that's probably because alist() is a slow way to generate a missing symbol:

    bench::mark(
      alist(x = ),
      list(x = quote(expr = )),
      check = FALSE
    )[1:5]
    #> # A tibble: 2 x 5
    #>   expression                    min     mean   median      max
    #>   <chr>                    <bch:tm> <bch:tm> <bch:tm> <bch:tm>
    #> 1 alist(x = )                 2.8µs   3.54µs   3.29µs   34.9µs
    #> 2 list(x = quote(expr = ))    169ns 219.38ns    181ns   24.2µs

(note the units)
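As a small illustration of that idiom, the sketch below splices quote(expr = ) into a call by hand to get x[i, , , drop = FALSE] for a 3-d array. The array and index are made up for the example, and the positions 4:5 and 6 are hard-coded for the three-dimensional case only.

```r
# Build x[i, , , drop = FALSE] using quote(expr = ) for the empty arguments.
mc <- quote(x[i])
mc[4:5] <- list(quote(expr = ))   # two empty (missing) arguments
mc[[6L]] <- FALSE
names(mc)[6L] <- "drop"
print(mc)

# Made-up data, just to check the call against direct indexing.
x <- array(1:24, dim = c(2L, 3L, 4L))
i <- 2L
res <- eval(mc)
stopifnot(identical(res, x[2L, , , drop = FALSE]))
```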
That's a good one. Need to change this in S4Vectors::default_extractROWS() and other places. Thanks!
H.
Hadley
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319