> Henrik Bengtsson:
>
> I'm looking for a way to get the length of an object 'x' as given by
> base data type without dispatching on class.

The performance improvement you're looking for is implemented in the
latest version of pqR (pqR-2016-10-24, see pqR-project.org), along
with corresponding improvements in several other circumstances where
unclass(x) does not create a copy of x.

Here are some examples (starting with yours), using pqR's Rprofmemt
function to get convenient traces of memory allocations:

> Rprofmemt(nelem=1000)  # trace allocations of vectors with >= 1000 elements
>
> x <- structure(double(1e6), class = c("foo", "numeric"))
RPROFMEM: 8000040 (double 1000000):"double" "structure"
RPROFMEM: 8000040 (double 1000000):"structure"
> length.foo <- function(x) 1L
> length(x)
[1] 1
> length(unclass(x))
[1] 1000000
>
> `+.foo` <- function (e1, e2) (unclass(e1) + unclass(e2)) %% 100
> z <- x + x
RPROFMEM: 8000040 (double 1000000):"+.foo"
>
> `<.foo` <- function (e1, e2) any(unclass(e1)<unclass(e2))
> x<x
[1] FALSE
>
> y <- unclass(x)
RPROFMEM: 8000040 (double 1000000):

There is no large allocation with length(unclass(x)), and only the
obviously necessary single allocation in +.foo (not two additional
allocations for unclass(e1) and unclass(e2)). For <.foo, there is no
large allocation at all, because not only are allocations avoided for
unclass(e1) and unclass(e2), but 'any' also avoids an allocation for
the result of the comparison. Unfortunately, assigning unclass(x) to
a variable does result in a copy being made (this might often be
avoided in future).
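
To try the example above outside pqR, the following is a minimal sketch in plain R. It assumes tracemem() as a rough stand-in for Rprofmemt: tracemem reports duplications of one traced object rather than all large allocations, and is available only in builds of R with memory profiling. The names n_dispatch, n_base, and lt are illustrative. No trace output is shown, since whether a copy is reported depends on the R implementation and version.

```r
# Plain-R sketch of the session above; uses no pqR-specific functions.
x <- structure(double(1e6), class = c("foo", "numeric"))

length.foo <- function(x) 1L                                   # S3 length method
`+.foo`    <- function(e1, e2) (unclass(e1) + unclass(e2)) %% 100
`<.foo`    <- function(e1, e2) any(unclass(e1) < unclass(e2))

# tracemem() exists only when R was built with memory profiling,
# so guard the call; it prints a message when x is duplicated.
profmem <- isTRUE(capabilities("profmem"))
if (profmem) invisible(tracemem(x))

n_dispatch <- length(x)           # dispatches to length.foo
n_base     <- length(unclass(x))  # length of the underlying double vector
z          <- x + x               # dispatches to +.foo
lt         <- x < x               # dispatches to <.foo

if (profmem) untracemem(x)
```

Any "tracemem[...]" lines printed while the four assignments run mark the points where that implementation duplicated x.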

These performance improvements are implemented using pqR's "variant
result" mechanism, which also allows many other optimizations. See
https://radfordneal.wordpress.com/2013/06/30/how-pqr-makes-programs-faster-by-not-doing-things/
for some explanation. There is no particular reason this mechanism
couldn't be incorporated into R Core's implementation of R.
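
As a rough analogy only, and not pqR's actual implementation, the idea that a caller such as 'any' needs just a summary of the comparison rather than the full logical vector can be sketched in plain R. The helper any_less and its block argument are hypothetical names introduced for illustration. It assumes plain numeric vectors of equal length with no NA values, and still allocates small temporaries for each block.

```r
# Hypothetical helper (not part of pqR or base R): computes the same
# answer as any(a < b) block by block, so a logical vector of length(a)
# is never allocated. Assumes equal lengths and no NA values.
any_less <- function(a, b, block = 10000L) {
  stopifnot(length(a) == length(b), block >= 1L)
  n <- length(a)
  start <- 1L
  while (start <= n) {
    end <- min(start + block - 1L, n)
    idx <- start:end
    # Only block-sized temporaries are created here.
    if (any(a[idx] < b[idx])) return(TRUE)  # stop at the first block with a TRUE
    start <- end + 1L
  }
  FALSE
}
```

For example, any_less(unclass(x), unclass(x)) gives the same answer as the <.foo method in the session above, subject to the stated assumptions.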

Radford Neal