> On Jun 8, 2018, at 1:49 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
>
> Hmmm, yes, there must be some special case in the C code to avoid
> recycling a length-1 logical vector:

Here is a version that (I think) handles Herve's issue of arrays having one or more 0 dimensions.

subset_ROW <-
    function(x,i)
{
    dims <- dim(x)
    index_list <- which(dims[-1] != 0L) + 3
    mc <- quote(x[i])
    nd <- max(1L, length(dims))
    mc[ index_list ] <- list(TRUE)
    mc[[ nd + 3L ]] <- FALSE
    names( mc )[ nd+3L ] <- "drop"
    eval(mc)
}

Curiously enough the timing is *much* better for this implementation than for the first version I sent.

Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to slow things down noticeably. It requires almost twice (!!) as much time as the version above.

Best,

Chuck

On Fri, Jun 8, 2018 at 2:09 PM, Berry, Charles <ccberry at ucsd.edu> wrote:
>
>
>> On Jun 8, 2018, at 1:49 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
>>
>> Hmmm, yes, there must be some special case in the C code to avoid
>> recycling a length-1 logical vector:
>
>
> Here is a version that (I think) handles Herve's issue of arrays having one or more 0 dimensions.
>
> subset_ROW <-
>     function(x,i)
> {
>     dims <- dim(x)
>     index_list <- which(dims[-1] != 0L) + 3
>     mc <- quote(x[i])
>     nd <- max(1L, length(dims))
>     mc[ index_list ] <- list(TRUE)
>     mc[[ nd + 3L ]] <- FALSE
>     names( mc )[ nd+3L ] <- "drop"
>     eval(mc)
> }
>
> Curiously enough the timing is *much* better for this implementation than for the first version I sent.
>
> Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to slow things down noticeably. It requires almost twice (!!) as much time as the version above.

I think that's probably because alist() is a slow way to generate a
missing symbol:

bench::mark(
  alist(x = ),
  list(x = quote(expr = )),
  check = FALSE
)[1:5]
#> # A tibble: 2 x 5
#>   expression                    min     mean   median      max
#>   <chr>                    <bch:tm> <bch:tm> <bch:tm> <bch:tm>
#> 1 alist(x = )                 2.8µs   3.54µs   3.29µs   34.9µs
#> 2 list(x = quote(expr = ))    169ns 219.38ns    181ns   24.2µs

(note the units)

Hadley

--
http://hadley.nz

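For readers who want to see what the two benchmarked expressions actually produce, here is a minimal base R sketch. It checks that both forms yield the same one-element list holding the empty (missing-argument) symbol, and then splices that symbol into a call the way the subset_ROW variants in this thread do. The object names `a`, `b`, `cl`, `x`, and `i` are illustrative assumptions, not taken from the messages.

```r
# Two ways to build a list whose single element is the empty
# (missing-argument) symbol.
a <- alist(x = )
b <- list(x = quote(expr = ))
same <- identical(a, b)

# Splice the empty symbol into a call: start from x[i], put a missing
# argument in position 4, then add a named `drop` argument.
cl <- quote(x[i])
cl[4L] <- list(quote(expr = ))
cl[["drop"]] <- FALSE
print(cl)

# Evaluate the constructed call against a small matrix (illustrative data).
x <- matrix(1:6, nrow = 3)
i <- 2L
res <- eval(cl)
print(dim(res))
```

The constructed call is equivalent to writing `x[i, , drop = FALSE]` by hand, so the only difference between the two list-building forms is how long they take, which is the point of the benchmark above.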
> On Jun 8, 2018, at 2:15 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
>
> On Fri, Jun 8, 2018 at 2:09 PM, Berry, Charles <ccberry at ucsd.edu> wrote:
>>
>>
>>> On Jun 8, 2018, at 1:49 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
>>>
>>> Hmmm, yes, there must be some special case in the C code to avoid
>>> recycling a length-1 logical vector:
>>
>>
>> Here is a version that (I think) handles Herve's issue of arrays having one or more 0 dimensions.
>>
>> subset_ROW <-
>>     function(x,i)
>> {
>>     dims <- dim(x)
>>     index_list <- which(dims[-1] != 0L) + 3
>>     mc <- quote(x[i])
>>     nd <- max(1L, length(dims))
>>     mc[ index_list ] <- list(TRUE)
>>     mc[[ nd + 3L ]] <- FALSE
>>     names( mc )[ nd+3L ] <- "drop"
>>     eval(mc)
>> }
>>
>> Curiously enough the timing is *much* better for this implementation than for the first version I sent.
>>
>> Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to slow things down noticeably. It requires almost twice (!!) as much time as the version above.
>
> I think that's probably because alist() is a slow way to generate a
> missing symbol:
>
> bench::mark(
>   alist(x = ),
>   list(x = quote(expr = )),
>   check = FALSE
> )[1:5]
> #> # A tibble: 2 x 5
> #>   expression                    min     mean   median      max
> #>   <chr>                    <bch:tm> <bch:tm> <bch:tm> <bch:tm>
> #> 1 alist(x = )                 2.8µs   3.54µs   3.29µs   34.9µs
> #> 2 list(x = quote(expr = ))    169ns 219.38ns    181ns   24.2µs
>
> (note the units)

Yes. That is good for about half the difference.

And I guess the rest is getting rid of seq().

This seems a bit quicker than anything else and satisfies Herve's objections:

subset_ROW <-
    function(x,i)
{
    dims <- dim(x)
    nd <- length(dims)
    index_list <- if (nd > 1) 2L + 2L:nd else 0
    mc <- quote(x[i])
    mc[ index_list ] <- list(quote(expr=))
    mc[[ "drop" ]] <- FALSE
    eval(mc)
}

Chuck

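As a rough illustration of how this last version behaves, the sketch below repeats the function from the message above (so that the block runs on its own) and applies it to a plain vector, a matrix, a 3-d array, and an array with a zero-extent dimension. The test objects `v`, `m`, `a3`, and `z` are made up for this example and do not come from the thread.

```r
# subset_ROW() as posted in the message above, repeated here so the
# sketch is self-contained.
subset_ROW <- function(x, i) {
  dims <- dim(x)
  nd <- length(dims)
  # Positions 4 .. nd + 2 of the call hold the 2nd .. nd-th subscripts.
  index_list <- if (nd > 1) 2L + 2L:nd else 0
  mc <- quote(x[i])
  mc[index_list] <- list(quote(expr = ))  # fill them with missing arguments
  mc[["drop"]] <- FALSE
  eval(mc)
}

# Illustrative inputs (assumptions for this sketch).
v  <- 1:5                                   # no dim attribute
m  <- matrix(1:6, nrow = 3)                 # 3 x 2
a3 <- array(1:24, dim = c(2, 3, 4))         # 2 x 3 x 4
z  <- array(integer(0), dim = c(3, 0, 2))   # one zero-extent dimension

r_v  <- subset_ROW(v, 2:3)
r_m  <- subset_ROW(m, 2L)
r_a3 <- subset_ROW(a3, 1L)
r_z  <- subset_ROW(z, 1:2)

print(r_v)
print(dim(r_m))
print(dim(r_a3))
print(dim(r_z))
```

In each case only the first dimension is subset and the remaining dimensions are kept, including the zero-extent one, because the missing arguments select everything along those dimensions and `drop = FALSE` prevents any dimension from being collapsed.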
On 06/08/2018 02:15 PM, Hadley Wickham wrote:
> On Fri, Jun 8, 2018 at 2:09 PM, Berry, Charles <ccberry at ucsd.edu> wrote:
>>
>>
>>> On Jun 8, 2018, at 1:49 PM, Hadley Wickham <h.wickham at gmail.com> wrote:
>>>
>>> Hmmm, yes, there must be some special case in the C code to avoid
>>> recycling a length-1 logical vector:
>>
>>
>> Here is a version that (I think) handles Herve's issue of arrays having one or more 0 dimensions.
>>
>> subset_ROW <-
>>     function(x,i)
>> {
>>     dims <- dim(x)
>>     index_list <- which(dims[-1] != 0L) + 3
>>     mc <- quote(x[i])
>>     nd <- max(1L, length(dims))
>>     mc[ index_list ] <- list(TRUE)
>>     mc[[ nd + 3L ]] <- FALSE
>>     names( mc )[ nd+3L ] <- "drop"
>>     eval(mc)
>> }
>>
>> Curiously enough the timing is *much* better for this implementation than for the first version I sent.
>>
>> Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to slow things down noticeably. It requires almost twice (!!) as much time as the version above.
>
> I think that's probably because alist() is a slow way to generate a
> missing symbol:
>
> bench::mark(
>   alist(x = ),
>   list(x = quote(expr = )),
>   check = FALSE
> )[1:5]
> #> # A tibble: 2 x 5
> #>   expression                    min     mean   median      max
> #>   <chr>                    <bch:tm> <bch:tm> <bch:tm> <bch:tm>
> #> 1 alist(x = )                 2.8µs   3.54µs   3.29µs   34.9µs
> #> 2 list(x = quote(expr = ))    169ns 219.38ns    181ns   24.2µs
>
> (note the units)

That's a good one. Need to change this in
S4Vectors::default_extractROWS() and other places.

Thanks!

H.

>
> Hadley
>
>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319