Displaying 20 results from an estimated 1000 matches similar to: "Benchmark code, but avoid printing"
2015 Jan 02
0
Benchmark code, but avoid printing
On Jan 2, 2015, at 12:02 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
> Dear all,
>
> I am trying to benchmark code that occasionally prints on the screen,
> and I want to suppress the printing. Is there an idiom for this?
>
> If I do
>
> sink(tempfile)
> microbenchmark(...)
> sink()
>
> then I'll be also measuring the costs of writing
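One possible idiom, sketched here under assumptions and not taken from the thread, is to divert output to the operating system's null device once, around the whole timing run, so that nothing is written to a real file. The helper `quiet` and the toy function `noisy` are hypothetical names, and base R's `system.time()` stands in for microbenchmark to keep the sketch self-contained. The cost of formatting the text inside `cat()` is still measured; only the cost of writing it somewhere is reduced.

```r
# Toy function that prints as a side effect (hypothetical example).
noisy <- function(n) {
  cat("working on", n, "\n")
  sum(seq_len(n))
}

# Path of the null device on the current platform.
null_path <- if (.Platform$OS.type == "windows") "NUL" else "/dev/null"

# Evaluate `expr` with standard output diverted to the null device.
# `expr` is a lazy promise, so it is only evaluated after sink() is active.
quiet <- function(expr) {
  con <- file(null_path, open = "wt")
  sink(con)
  on.exit({
    sink()      # restore normal output
    close(con)  # release the connection
  })
  expr
}

# The return value is preserved; only the printed text is discarded.
value <- quiet(noisy(10))

# Divert once around the whole timing loop, not once per iteration.
timing <- quiet(system.time(for (i in 1:1000) noisy(10)))
```

Diverting once around the loop keeps the cost of opening and closing the connection out of the per-call timings.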
2017 Aug 22
4
How to benchmark speed of load/readRDS correctly
Dear all
I was thinking about reading data into R efficiently and tried several ways to test whether load(file.Rdata) or readRDS(file.rds) is faster. The files file.Rdata and file.rds contain the same data, the first created with save(d, file='file.Rdata', compress=F) and the second with saveRDS(d, 'file.rds', compress=F).
First I used the function microbenchmark() and was astonished
2016 Oct 27
4
Find the first non-NA column
Finally, using the linear matrix indexing that luisfo proposed earlier:
> t <- Sys.time()
> M=as.matrix(dat)
> index <- which(!is.na(M)) - 1
> meses<-colnames(M)
> M2<- data.table(columna=index %/% nrow(M) +1L, jugador=index %% nrow(M) +1L , valor=M[index+1L])
> setkey(M2,jugador,columna)
>
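The index arithmetic in the snippet above can be illustrated with a small base R sketch. The matrix and the names `players`, `first_col` and `first` are hypothetical, and a plain data frame stands in for the data.table used in the thread. R stores matrices column by column, so a zero-based linear position splits into a column (integer quotient by the row count) and a row (remainder).

```r
# Hypothetical data: rows are players, columns are months.
# matrix() fills column by column, so Jan = (1, NA, NA), Feb = (2, 6, NA), ...
M <- matrix(c(1, NA, NA,
              2,  6, NA,
              8,  6,  7),
            nrow = 3,
            dimnames = list(NULL, c("Jan", "Feb", "Mar")))

idx0 <- which(!is.na(M)) - 1L    # zero-based linear positions of non-NA cells
col  <- idx0 %/% nrow(M) + 1L    # column of each non-NA cell
row  <- idx0 %%  nrow(M) + 1L    # row (player) of each non-NA cell

# Earliest non-NA column for every row that has at least one value.
first_by_row <- tapply(col, row, min)
players   <- as.integer(names(first_by_row))
first_col <- as.integer(first_by_row)

first <- data.frame(player = players,
                    month  = colnames(M)[first_col],
                    value  = M[cbind(players, first_col)],
                    stringsAsFactors = FALSE)
print(first)
```

Rows that contain only NA values never appear in `idx0`, so they are absent from the result and would need separate handling.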
2013 Jul 02
2
cache most-recent dispatch
Hi,
S4 method dispatch can be very slow. Would it be reasonable to cache the
most recent dispatch, anticipating that the next invocation will be on the
same type? This would be very helpful in loops.
fun0 <- function(x)
sapply(x, paste, collapse="+")
fun1 <- function(x) {
paste <- selectMethod(paste, class(x[[1]]))
sapply(x, paste,
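The manual workaround hinted at in the snippet, looking the method up once and reusing it, might look roughly like this. The generic `describe` and the class `Point` are hypothetical and exist only for illustration; the sketch assumes every element of the list has the same class, which is exactly the case the poster wants to optimise.

```r
library(methods)

# Hypothetical S4 generic, class, and method, for illustration only.
setGeneric("describe", function(x) standardGeneric("describe"))
setClass("Point", representation(x = "numeric"))
setMethod("describe", "Point", function(x) paste0("Point(", x@x, ")"))

pts <- lapply(1:3, function(i) new("Point", x = i))

# Ordinary call: dispatch happens on every invocation.
via_dispatch <- sapply(pts, describe)

# Look the method up once, then call it directly inside the loop.
# Only valid if every element really has the class of the first one.
method <- selectMethod("describe", class(pts[[1]]))
via_selected <- sapply(pts, method)

stopifnot(identical(via_dispatch, via_selected))
```

Calling the selected method directly skips dispatch, so it silently does the wrong thing if a later element belongs to a different class.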
2016 Oct 27
3
Find the first non-NA column
Let us imagine we have a matrix of time-series data per subject.
Say, the number of times a player has played a card in an online game. And
I want to know how many times they played the card in the first month they
were in the game.
But of course my matrix stores the data over time in such a way that:
# data.table( Enero = c( 1, 4, NA , NA , NA) , Febrero = c( 2, 6, 1, NA, NA
) , Marzo = c( 8,6,7,3,
2016 Oct 27
2
Find the first non-NA column
Another, somewhat faster solution:
> t <- Sys.time()
> dat[,jugador:=1:.N]
> dat2=melt(dat,id.vars="jugador")
> setkey(dat2,jugador)
> dat2[,index:=min(which(!is.na(value))),by=jugador]
> dat2[,.(First_month=variable[index[1]],Value_First_month=value[index[1]]),by=jugador]
jugador First_month Value_First_month
1: 1 Uno 0.93520715
2:
2015 Jan 26
2
speedbump in library
>>>>> Winston Chang <winstonchang1 at gmail.com>
>>>>> on Fri, 23 Jan 2015 10:15:53 -0600 writes:
> I think you can simplify a little by replacing this:
> pkg %in% loadedNamespaces()
> with this:
> .getNamespace(pkg)
almost: It would be
!is.null(.getNamespace(pkg))
> Whereas getNamespace(pkg) will load the
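A small sketch of the checks being compared, using `stats` as an example of a namespace that is loaded in a default session and a made-up package name for the negative case. `isNamespaceLoaded()` is a further base R alternative that is not mentioned in the snippet.

```r
pkg <- "stats"  # loaded in a default R session

# Three ways to ask whether a namespace is already loaded,
# none of which triggers loading.
in_loaded_list <- pkg %in% loadedNamespaces()
via_registry   <- !is.null(.getNamespace(pkg))
via_helper     <- isNamespaceLoaded(pkg)

# For a namespace that is not loaded, .getNamespace() returns NULL,
# which is why the !is.null() wrapper is needed to get a logical answer.
# The package name below is made up for illustration.
missing_pkg <- "surelyNotAnInstalledPackage123"
missing_result <- .getNamespace(missing_pkg)
```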
2018 Feb 11
4
Parallel assignments and goto
Hi guys,
I am working on some code for automatically translating recursive functions into looping functions to implement tail-recursion optimisations. See https://github.com/mailund/tailr
As a toy-example, consider the factorial function
factorial <- function(n, acc = 1) {
if (n <= 1) acc
else factorial(n - 1, acc * n)
}
I can automatically translate this into the loop-version
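For orientation, a hand-written loop equivalent of the toy function is sketched below. It is not the output of the tailr package, whose generated code is cut off in this snippet; the names `factorial_rec`, `factorial_loop`, `new_n` and `new_acc` are chosen here for illustration.

```r
# Recursive version from the post, renamed to avoid masking base::factorial.
factorial_rec <- function(n, acc = 1) {
  if (n <= 1) acc
  else factorial_rec(n - 1, acc * n)
}

# Hand-written loop equivalent: the tail call becomes an update of the
# function's own arguments followed by another pass through the loop.
factorial_loop <- function(n, acc = 1) {
  repeat {
    if (n <= 1) return(acc)
    # "Parallel assignment": compute both new values from the old ones
    # before overwriting either variable.
    new_n   <- n - 1
    new_acc <- acc * n
    n   <- new_n
    acc <- new_acc
  }
}

stopifnot(factorial_loop(5) == factorial_rec(5))
```

The temporaries matter because updating `n` first would make `acc * n` use the already decremented value.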
2017 Aug 04
2
Why is as.function() slower than eval(call("function"())?
(Apologies if this is better suited for R-help.)
On my system (macOS Sierra, late 2014 MacBook Pro; R 3.4.1, Homebrew build), I found that it is faster to construct a function using eval(call("function", ...)) than using as.function(list(...)). Example:
make_fn_1 <- function(a, b) eval(call("function", a, b), env = parent.frame())
make_fn_2 <- function(a, b)
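The two construction routes being compared can be illustrated with a minimal sketch. The argument list and body are hypothetical, and this is not the poster's `make_fn_2`, which is cut off above.

```r
# Hypothetical formals and body: function(x, y = 2) x + y
args <- as.pairlist(alist(x = , y = 2))
body_expr <- quote(x + y)

# Route 1: build a call to `function` and evaluate it.
f1 <- eval(call("function", args, body_expr))

# Route 2: as.function() takes a list whose last element is the body
# and whose preceding elements are the formal arguments.
f2 <- as.function(alist(x = , y = 2, x + y))

f1(1)     # x = 1, y defaults to 2
f2(1, 5)  # x = 1, y = 5
```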
2018 Mar 13
4
Possible Improvement to sapply
While working with sapply, I noticed that the documentation states that the simplify argument will yield a vector, matrix, etc. "when possible". I was curious how the code actually defines "when possible" and found this within the function
if (!identical(simplify, FALSE) && length(answer))
This seems superfluous to me, in particular this part:
!identical(simplify, FALSE)
The preceding
2024 Feb 29
2
[External] converting MATLAB -> R | element-wise operation
I decided to do a direct comparison of transpose and sweep.
library(microbenchmark)
NN <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE) # Example matrix
lambda <- c(2, 3, 4) # Example vector
colNN <- t(NN)
microbenchmark(
sweep = sweep(NN, 2, lambda, "/"),
transpose = t(t(NN)/lambda),
colNN = colNN/lambda
)
Unit: nanoseconds
expr min lq
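Before comparing timings it is worth checking what each expression returns. The sketch below reuses the poster's `NN` and `lambda`; the result names are introduced here for illustration. The first two expressions give the same matrix, while the third works on the pre-transposed matrix and returns the transposed result, so it does less work per call.

```r
NN <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
lambda <- c(2, 3, 4)

by_sweep     <- sweep(NN, 2, lambda, "/")  # divide column j by lambda[j]
by_transpose <- t(t(NN) / lambda)          # same result via recycling down columns
by_colNN     <- t(NN) / lambda             # transposed layout, no transposes timed

stopifnot(isTRUE(all.equal(by_sweep, by_transpose)))
stopifnot(isTRUE(all.equal(by_colNN, t(by_sweep))))
```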
2017 Aug 22
0
How to benchmark speed of load/readRDS correctly
The large value for maximum time may be due to garbage collection, which
happens periodically. E.g., try the following, where the
unlist(as.list()) creates a lot of garbage. I get a very large time every
102 or 51 iterations and a moderately large time more often
mb <- microbenchmark::microbenchmark({ x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5) ; sum(x) }, times=1000)
2018 Feb 27
2
Parallel assignments and goto
Interestingly, the <<- operator is also a lot faster than using a namespace explicitly, and only slightly slower than using <- with local variables, see below. But, surely, both must at some point insert values in a given environment (either the local one, for <-, or an enclosing one, for <<-), so I guess I am asking if there is a more low-level assignment operation I can get my
2018 Mar 13
2
Possible Improvement to sapply
FYI, in R devel (to become 3.5.0), there's isFALSE() which will cut
some corners compared to identical():
> microbenchmark::microbenchmark(identical(FALSE, FALSE), isFALSE(FALSE))
Unit: nanoseconds
expr min lq mean median uq max neval
identical(FALSE, FALSE) 984 1138 1694.13 1218.0 1337.5 13584 100
isFALSE(FALSE) 713 761 1133.53 809.5 871.5
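The two tests are not interchangeable in every case, which a short sketch can show. `is_false` below is a hypothetical stand-in written to mirror the documented behaviour of `isFALSE()`, so that the example also runs on R versions older than 3.5.0.

```r
# Stand-in mirroring the documented definition of isFALSE():
# a single, non-missing logical value that is FALSE.
is_false <- function(x) {
  is.logical(x) && length(x) == 1L && !is.na(x) && !x
}

plain <- FALSE
named <- c(flag = FALSE)  # same value, but carries a names attribute

identical(plain, FALSE)   # TRUE
identical(named, FALSE)   # FALSE: the attribute makes the objects differ
is_false(plain)           # TRUE
is_false(named)           # TRUE: attributes are ignored

is_false(NA)              # FALSE
is_false(c(FALSE, FALSE)) # FALSE: length is not 1
```

Because attributes are ignored, swapping `identical(x, FALSE)` for `isFALSE(x)` can change behaviour for inputs such as a named `FALSE`.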
2020 Nov 01
2
parallel PSOCK connection latency is greater on Linux?
I'm exploring latency overhead of parallel PSOCK workers and noticed
that serializing/unserializing data back to the main R session is
significantly slower on Linux than it is on Windows/MacOS with similar
hardware. Is there a reason for this difference and is there a way to
avoid the apparent additional Linux overhead?
I attempted to isolate the behavior with a test that simply returns
2017 Nov 20
2
Small performance bug in [.Date
Hi all,
I think there's an unnecessary line in [.Date which has a considerable
impact on performance when subsetting large dates:
x <- Sys.Date() + 1:1e6
microbenchmark::microbenchmark(x[1])
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> x[1] 920.651 1039.346 3624.833 2294.404 3786.881 41176.38 100
`[.Date` <- function(x, ...,
2018 Mar 13
1
Possible Improvement to sapply
Could your code use vapply instead of sapply? vapply forces you to declare
the type and dimensions of FUN's output and stops if any call to FUN does
not match the declaration. It can use much less memory and time than sapply
because it fills in the output array as it goes instead of calling lapply()
and seeing how it could be simplified.
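A brief sketch of the difference, with made-up input data: both functions agree on well-behaved input, but only vapply has a predictable return type for empty input and rejects results that do not match the declared template.

```r
words <- list("a", c("b", "c"), character(0))  # hypothetical input

# Same answer when every result has the declared shape.
n_sapply <- sapply(words, length)
n_vapply <- vapply(words, length, FUN.VALUE = integer(1))

# Empty input: sapply falls back to an empty list, while vapply still
# returns a vector of the declared type.
empty_sapply <- sapply(list(), length)
empty_vapply <- vapply(list(), length, FUN.VALUE = integer(1))

# A result that does not match the declaration is an error in vapply.
# The second element has length 2, not the declared length 1.
mismatch <- tryCatch(
  vapply(words, function(w) w, FUN.VALUE = character(1)),
  error = function(e) "error"
)
```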
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Tue,
2015 Mar 05
3
Performance issue in stats:::weighted.mean.default method
Hi,
I'm using this mailing list for the first time and I hope this is the
right one. I don't think that the following is a bug but it can be a
performance issue.
In my opinion, there is no need to filter by [w != 0] in the last sum of
the weighted.mean.default method defined in
src/library/stats/R/weighted.mean.R. There is no need to do it because
you can always sum zero numbers and
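The two forms of the sum can be compared directly. This sketch assumes the final expression has the shape `sum((x*w)[w != 0])/sum(w)`; the snippet is cut off before the argument is complete, so the code is only an illustration, and the data is made up. For finite data the filter makes no difference, but a non-finite value with zero weight does change the result, because `Inf * 0` is `NaN`.

```r
# Finite data: filtering out zero weights does not change the sum.
x <- c(1, 2, 3, 4)
w <- c(0.5, 0, 1.5, 0)
filtered   <- sum((x * w)[w != 0]) / sum(w)
unfiltered <- sum(x * w) / sum(w)
stopifnot(isTRUE(all.equal(filtered, unfiltered)))

# Non-finite value with zero weight: the filter drops the NaN term.
x_inf <- c(1, Inf)
w_inf <- c(1, 0)
filtered_inf   <- sum((x_inf * w_inf)[w_inf != 0]) / sum(w_inf)
unfiltered_inf <- sum(x_inf * w_inf) / sum(w_inf)
```

Here `filtered_inf` stays finite while `unfiltered_inf` becomes `NaN`, so any speed-up from dropping the filter would come with a change in behaviour for such inputs.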
2017 Aug 22
1
How to benchmark speed of load/readRDS correctly
Note that if you force a garbage collection each iteration the times are
more stable. However, on the average it is faster to let the garbage
collector decide when to leap into action.
mb_gc <- microbenchmark::microbenchmark(gc(), { x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5) ; sum(x) }, times=1000, control=list(order="inorder"))
with(mb_gc,
2018 Feb 26
0
Parallel assignments and goto
Following up on this attempt of implementing the tail-recursion optimisation (now that I've finally had the chance to look at it again), I find that non-local return implemented with callCC doesn't actually incur much overhead once I do it more sensibly. I haven't found a good way to handle parallel assignments that isn't vastly slower than simply introducing extra variables, so I am going with