similar to: extracting rows from a data frame by looping over the row names: performance issues

Displaying 20 results from an estimated 9000 matches similar to: "extracting rows from a data frame by looping over the row names: performance issues"

2009 Dec 07
4
Subset sum problem.
Hi, I'm quite new to the R-project. It was suggested that I look into it because I am trying to solve the "Subset sum problem", which basically is: Given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my
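For small sets the problem can be brute-forced directly in R; a minimal sketch (not taken from the thread, and impractical beyond a few dozen elements since it enumerates every subset size):

subset_sum <- function(set, s) {
  # check every non-empty subset size k for a combination summing to s
  for (k in seq_along(set)) {
    if (any(combn(set, k, FUN = sum) == s)) return(TRUE)
  }
  FALSE
}
subset_sum(c(3, 34, 4, 12, 5, 2), 9)   # TRUE, since 4 + 5 = 9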
2008 Jun 28
2
Parallel R
Hello, The problem I'm working on now requires operating on big matrices. I've noticed that there are some packages that allow running some commands in parallel. I've tried snow and NetWorkSpaces, without much success (they are far slower than the normal functions). My problem is very simple: it doesn't require any communication between parallel tasks; only that it divides
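For a job with no communication between workers, the snow-style cluster API (now also in the base 'parallel' package) is usually enough; a minimal sketch, where the matrix and the per-chunk work are placeholder assumptions:

library(parallel)
m  <- matrix(rnorm(1e6), ncol = 100)   # hypothetical big matrix
cl <- makeCluster(4)                   # one worker per core; adjust as needed
clusterExport(cl, "m")                 # each task is independent, no communication
res <- parSapply(cl, seq_len(ncol(m)), function(j) sum(m[, j]))
stopCluster(cl)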
2011 Sep 13
12
Assertions for asynchronous behaviour
Hi all, In GOOS[1] they use an assertion called assertEventually which samples the system for a success state until a certain timeout has elapsed. This allows you to synchronise the tests with asynchronous code. Do we have an equivalent of that in the Ruby / RSpec world already? I know capybara has wait_until { } but that's fairly rudimentary - the failure message isn't very
2010 Feb 04
2
help needed using t.test with factors
I am trying to use t.test on the following data:
date       type INTERVAL nCASES MTF   SDF   MTO   SDO   nFST MF    nOBS MO    MB    BIASCV BIASEV ME     MAE   RMSE  CRCF
2001-06-15 avn  GE1.00   4385   0.246 0.300 1.502 0.556 1367 1.373 4385 1.502 1.471 0.285  0.164  -1.256 1.266 1.399 0.056
2001-06-15 avn
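A minimal sketch of t.test() with a grouping factor (an assumption about the intent, since the post is truncated; the data and the second "eta" level are made up):

dat <- data.frame(type = factor(rep(c("avn", "eta"), each = 20)),
                  RMSE = c(rnorm(20, mean = 1.4, sd = 0.1),
                           rnorm(20, mean = 1.5, sd = 0.1)))
t.test(RMSE ~ type, data = dat)   # two-sample t test comparing the two forecast types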
2016 Oct 26
5
BUG?: On Linux setTimeLimit() fails to propagate timeout error when it occurs (works on Windows)
setTimeLimit(elapsed=1) causes a timeout error whenever a call takes more than one second. For instance, this is how it works on Windows (R 3.3.1): > setTimeLimit(elapsed=1) > Sys.sleep(10); message("done") Error in Sys.sleep(10) : reached elapsed time limit Also, the error propagates immediately and causes an interrupt after ~1 second; > system.time({ Sys.sleep(10);
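A compact reproduction sketch (restating the report's example so it runs as one unit; on Windows the error is raised as described, while on Linux the report says it is not propagated until later):

setTimeLimit(elapsed = 1)
tryCatch(Sys.sleep(10), error = function(e) message("caught: ", conditionMessage(e)))
setTimeLimit(elapsed = Inf)   # clear the limit again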
2010 May 04
3
Idiomatic looping over list name, value pairs in R
Considering the Python code: for k, v in d.items(): do_something(k); do_something_else(v) I have the following for R: for (i in c(1:length(d))) { do_something(names(d[i])); do_something_else(d[[i]]) } This does not seem idiomatic. What is the best way of doing the same with R? Thanks. Luis
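One common idiom is to loop over the names and index with [[ (a sketch; do_something() and do_something_else() are the post's own placeholders):

for (k in names(d)) {
  do_something(k)
  do_something_else(d[[k]])
}
# or, without an explicit loop:
invisible(Map(function(k, v) { do_something(k); do_something_else(v) }, names(d), d))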
2016 Oct 01
2
socketSelect(..., timeout): non-integer timeouts in (0, 2) (?) equal infinite timeout on Linux - weird
There's something weird going on for certain non-integer values of argument 'timeout' to base::socketSelect(). For such values, there is no timeout and you effectively end up with an infinite timeout. I can reproduce this on R 3.3.1 on Ubuntu 16.04 and RedHat 6.6, but not on Windows (via Linux Wine). # 1. In R master session > con <- socketConnection('localhost', port
2011 Nov 18
3
tip: large plots
Hi all, I'm working with a bunch of large graphs, and stumbled across something useful. Probably many of you know this, but I didn't and so others might benefit. Using pch="." speeds up plotting considerably over using symbols. > x <- runif(1000000) > y <- runif(1000000) > system.time(plot(x, y, pch=".")) user system elapsed 1.042 0.030 1.077
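For comparison (a sketch; absolute timings vary by machine and graphics device), the same plot with the default symbol and with a small filled dot:

x <- runif(1000000); y <- runif(1000000)
system.time(plot(x, y))               # default open circles: slowest
system.time(plot(x, y, pch = 20))     # small filled circle: usually in between
system.time(plot(x, y, pch = "."))    # tiny point: fastest, as the tip reports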
2017 Oct 05
1
socketSelect(..., timeout): non-integer timeouts in (0, 2) (?) equal infinite timeout on Linux - weird
Fixed in 73470 Best, Tomas On 10/05/2017 06:11 AM, Henrik Bengtsson wrote: > I'd like to follow up/bump the attention to this bug causing the > timeout to fail for socketSelect() on Unix. It is still there in R > 3.4.2 and R-devel. I've identified the bug in the R source code - the > bug is due to floating-point precisions and comparison using >=. See > PR17203
2009 May 10
2
In C, a fast way to slice a vector?
Hello, Suppose in the following code, PROTECT(sr = R_tryEval( .... )) sr is a RAWSXP vector. I wish to return another RAWSXP starting at position 13 onwards (base=0). I could create another RAWSXP of the correct length and then memcpy the required bytes and length to this new one. However, is there a more efficient method? Regards Saptarshi Guha
2007 Jan 06
1
Can't load XML_1.4-0.zip in last R devel
Hi, I can't load XML_1.4-0.zip in last R devel (Windows): R version 2.5.0 Under development (unstable) (2007-01-05 r40386) Copyright (C) 2007 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for
2013 May 16
1
setTimeLimit sometimes fails to terminate idle call in R
I would like to use setTimeLimit to abort operations that are stuck waiting (idle) after n seconds. Below is a toy example in which Sys.sleep is a placeholder for an idle call: testlimit <- function(){ setTimeLimit(elapsed=3, transient=TRUE); Sys.sleep(10); } system.time(testlimit()); However, this is giving inconsistent results. On Windows and in RStudio Server (Linux) the call is
2010 Jan 25
2
(no subject)
Hello -- I would like to know of a more efficient way of writing the following piece of code. Thanks. options(stringsAsFactors=FALSE) orig <- c(rep('11111111',100000),rep('22222222',200000),rep('33333333',300000),rep('44444444',400000)) orig.unique <- unique(orig) system.time(df <- as.data.frame(sapply(orig.unique, function(x) ifelse(orig==x, 1, 0))))
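One way to speed this up (a sketch, not necessarily the answer given in the thread) is to build the whole 0/1 indicator matrix in a single vectorised comparison instead of a per-column ifelse():

options(stringsAsFactors = FALSE)
orig <- c(rep('11111111', 100000), rep('22222222', 200000),
          rep('33333333', 300000), rep('44444444', 400000))
orig.unique <- unique(orig)
system.time(df <- as.data.frame(1L * outer(orig, orig.unique, "==")))
names(df) <- orig.unique   # keep the original column names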
2013 Oct 30
2
Huge performance difference between implicit and explicit print
Hi all, Can anyone help me understand why an implicit print (i.e. just typing df at the console) is so much slower than an explicit print (i.e. print(df)) in the example below? I see the difference in both RStudio and in a terminal. # Construct large df as quickly as possible dummy <- 1:18e6 df <- lapply(1:10, function(x) dummy) names(df) <- letters[1:10] class(df) <-
2008 Jul 25
1
serialize() via temporary file is heaps faster than doing it directly (on Windows)
Hi, FYI, I just noticed that on Windows (but not Linux) it is orders of magnitude (below, 50x) faster to serialize() an object to a temporary file and then read it back than to serialize to a raw vector directly. This has, for instance, an impact on how fast digest::digest() can provide a checksum. Example: x <- 1:1e7; t1 <- system.time(raw1 <- serialize(x, connection=NULL));
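A sketch completing the comparison (the code after the cut-off is an assumption; only the idea of benchmarking the two routes is taken from the post):

x <- 1:1e7
t1 <- system.time(raw1 <- serialize(x, connection = NULL))
tf <- tempfile()
t2 <- system.time({
  con <- file(tf, open = "wb")
  serialize(x, connection = con)
  close(con)
  raw2 <- readBin(tf, what = "raw", n = file.info(tf)$size)
})
t1; t2   # compare elapsed times; the post reports the file route much faster on Windows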
2012 Feb 25
5
which is the fastest way to make data.frame out of a three-dimensional array?
foo <- rnorm(30*34*12) dim(foo) <- c(30, 34, 12) I want to make a data.frame out of this three-dimensional array. Each dimension will be a variable (column) in the data.frame. I know how this can be done in a very slow way using for loops, like this: x <- rep(seq(from = 1, to = 30), 34) y <- as.vector(sapply(1:34, function(x) {rep(x, 30)})) month <- as.vector(sapply(1:12,
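One vectorised approach (a sketch, not necessarily the answer the thread settled on): expand.grid() varies its first argument fastest, which matches the column-major order in which as.vector() flattens the array, so the index columns line up without any loop:

foo <- rnorm(30 * 34 * 12)
dim(foo) <- c(30, 34, 12)
df <- cbind(expand.grid(x = 1:30, y = 1:34, month = 1:12), value = as.vector(foo))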
2018 May 15
0
Systemfit
... and the mailing list is picky about attachments... whatever you attached did not conform to the stringent requirements mentioned in the Posting Guide. Pasting the code right into the email is usually safest, though you DO have to post using plain text (as the Posting Guide indicates) or your code may get mangled by the automatic html format removal. On May 15, 2018 7:04:31 AM PDT, Bert Gunter
2016 Dec 20
2
colnames for data.frame could be greatly improved
Hello, colnames seems to be poorly optimized for data.frame. It short-circuits for data.frame in if (is.data.frame(x) && do.NULL) return(names(x)) but only when do.NULL is TRUE. This makes a huge difference when do.NULL is FALSE. Minimal edit to `colnames`:
if (is.data.frame(x)) {
    nm <- names(x)
    if (do.NULL || !is.null(nm))
        return(nm)
    else
2011 Jul 29
4
finding a faster way to run lm on rows of predictor matrix
Hi, everyone. I need to run lm with the same response vector but with varying predictor vectors (i.e. 1 response vector against each of 6,000 individual predictor vectors). After looking through the R archive, I found roughly 3 methods that have been suggested. Unfortunately, I need to run this task multiple times (~5,000 times) and would like to find a faster way than the existing methods. All three
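One of the usual speed-ups (a sketch with made-up data; it may well be among the three methods the post alludes to) is to skip lm()'s formula and model-frame overhead and call lm.fit() on a prebuilt design matrix for each predictor:

set.seed(1)
n <- 100
y <- rnorm(n)                            # fixed response (hypothetical data)
X <- matrix(rnorm(n * 6000), nrow = n)   # 6,000 candidate predictors, one per column
coefs <- apply(X, 2, function(x) lm.fit(cbind(1, x), y)$coefficients)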
2013 Jun 23
1
stats::convolve documentation enhancement
Hi, the function stats::convolve does not mention efficient usage of the underlying FFT algorithm, such as (a) if type="circular", then length(x)=length(y) should have many factors (e.g. length(x) = length(y) = 2^n) (b) if type="open" or "filter", then length(x)+length(y)-1 should have many factors (e.g. length(x)+length(y)-1 = 2^n) In particular the latter may