similar to: merge( , by='row.names') slowness

Displaying 20 results from an estimated 20000 matches similar to: "merge( , by='row.names') slowness"

2008 Aug 18
2
matrix row product and cumulative product
I spent a lot of time searching and came up empty-handed on the following query. Is there an equivalent to rowSums that does product or cumulative product and avoids use of apply or looping? I found a rowProd in a package but it was a convenience function for apply. As part of a likelihood calculation called from optim, I'm computing products and cumulative products of rows of matrices with
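A minimal base-R sketch of one apply-free approach, assuming all entries are strictly positive so working on the log scale is safe (otherwise the matrixStats functions rowProds and rowCumprods are the usual suggestion):
m <- matrix(runif(12), nrow = 3)
rowprod <- exp(rowSums(log(m)))                 # row products without apply()
U <- upper.tri(diag(ncol(m)), diag = TRUE) * 1  # running-sum operator
rowcumprod <- exp(log(m) %*% U)                 # row cumulative products
all.equal(rowcumprod, t(apply(m, 1, cumprod)))  # check against the apply version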
2007 Nov 26
2
colnames slow (PR#10470)
Full_Name: Tomas Larsson Version: 2.6.0 OS: Windows XP Submission from: (NULL) (198.208.251.24) This is not a bug; it is a performance issue, but I think it should have an easy fix. I have a large matrix (about 2,000,000 by 20); when I type colnames(x) it takes a long time to get the result. However, if I select just the first couple of rows of the matrix I don't have to wait for the
2012 Feb 14
1
Filling out a data frame row by row.... slow!
I'm reading a file and using the file to populate a data frame. The way the file is laid out, I need to fill in the data frame one row at a time. When I start reading my file, I don't know how many rows I will need. It's on the order of a million. Being mindful of the time expense of reallocation, I decided on a strategy of doubling the data frame size every time I needed to expand
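A sketch of the usual alternative to growing a data frame in place: collect each parsed row in a pre-sized list and bind once at the end. The file name is illustrative and parse_line() is a hypothetical placeholder for whatever turns one file line into a one-row data frame.
rows <- vector("list", 1e6)          # generous guess; only the filled part is used
i <- 0L
for (line in readLines("input.txt")) {
  i <- i + 1L
  rows[[i]] <- parse_line(line)      # hypothetical: returns a one-row data frame
}
df <- do.call(rbind, rows[seq_len(i)])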
2007 Sep 19
3
Row-by-row regression on matrix
Folks, I have a 3000 x 4 matrix (y), which I need to regress row-by-row against a 4-vector (x) to create a matrix lm.y of intercepts and slopes. To illustrate: y <- matrix(rnorm(12000), ncol = 4) x <- c(1/12, 3/12, 6/12, 1) system.time(lm.y <- t(apply(y, 1, function(z) lm(z ~ x)$coefficient))) [1] 44.72 18.00 69.52 NA NA Takes more than a minute to do (and I need to do many
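One common speed-up for this pattern is that lm() accepts a matrix response, so all 3000 regressions can be fitted in a single call; a sketch using the example above:
y <- matrix(rnorm(12000), ncol = 4)
x <- c(1/12, 3/12, 6/12, 1)
fit <- lm(t(y) ~ x)      # one multi-response fit instead of 3000 separate lm() calls
lm.y <- t(coef(fit))     # 3000 x 2 matrix: column 1 intercepts, column 2 slopes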
2011 Oct 19
2
Speed difference between df$a[1] and df[1,"a"]
I was surprised to find that df$a[1] is an order of magnitude faster than df[1,"a"]: > df <- data.frame(a=1:10) > system.time(replicate(100000, df$a[3])) user system elapsed 0.36 0.00 0.36 > system.time(replicate(100000, df[3,"a"])) user system elapsed 4.09 0.00 4.09 A priori, I'd have thought that combining the row and column
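A small sketch of the usual ways to dodge the `[.data.frame` dispatch that makes df[1,"a"] expensive (timings will vary by machine):
df <- data.frame(a = 1:10)
system.time(replicate(100000, df$a[3]))               # extract the column, then the element
system.time(replicate(100000, .subset2(df, "a")[3]))  # internal accessor, no S3 dispatch
system.time(replicate(100000, df[3, "a"]))            # full data-frame indexing, slowest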
2024 Apr 08
4
Exceptional slowness with read.csv
Greetings, I have a csv file of 76 fields and about 4 million records. I know that some of the records have errors - unmatched quotes, specifically. Reading the file with readLines and parsing the lines with read.csv(text = ...) is really slow. I know that the first 2459465 records are good. So I try this: > startTime <- Sys.time() > first_records <- read.csv(file_name, nrows
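A hedged sketch of the chunked read the post appears to be building toward; the skip offset and colClasses details below are assumptions, not from the original message:
startTime <- Sys.time()
first_records <- read.csv(file_name, nrows = 2459465)
classes <- sapply(first_records, class)        # reuse the inferred column types
next_records <- read.csv(file_name, header = FALSE, skip = 2459465 + 1,
                         nrows = 5, colClasses = classes,
                         col.names = names(first_records))
Sys.time() - startTime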
2010 Jul 30
2
(no subject)
Hello, I am new to R and trying to calculate the beta coefficient of a standard linear regression for a series of randomly generated numbers. I have created this loop, but it runs really slowly; is there a way to improve it? #number of simulations n.k<-999 #create the matrix for regression coefficients generated from #random data beta<-matrix(0,1,n.k+1) e<-matrix(0,tslength,n.k+1) for(k
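The loop itself is cut off above, but if each simulated series in e is regressed on a single regressor x, the slopes have a closed form and the loop collapses to one matrix operation. A sketch under that assumption (tslength and x are placeholders):
tslength <- 100
n.k <- 999
x <- rnorm(tslength)                                          # placeholder regressor
e <- matrix(rnorm(tslength * (n.k + 1)), tslength, n.k + 1)   # simulated series
xc <- x - mean(x)
beta <- crossprod(xc, e) / sum(xc^2)   # 1 x (n.k+1) matrix of slope estimates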
2008 Mar 10
2
write.table with row.names=FALSE unnecessarily slow?
write.table with large data frames takes quite a long time > system.time({ + write.table(df, '/tmp/dftest.txt', row.names=FALSE) + }, gcFirst=TRUE) user system elapsed 97.302 1.532 98.837 A reason is because dimnames is always called, causing 'anonymous' row names to be created as character vectors. Avoiding this in src/library/utils, along the lines of Index:
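Until something like that patch lands, a user-side workaround is to sidestep write.table entirely; a sketch using data.table::fwrite, with a synthetic df for illustration (the timings in the post are not reproduced here):
df <- data.frame(x = rnorm(1e6), y = rnorm(1e6))
system.time(write.table(df, "/tmp/dftest.txt", row.names = FALSE))
system.time(data.table::fwrite(df, "/tmp/dftest_fwrite.txt"))   # C writer, no row-name cost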
2024 Apr 08
1
Exceptional slowness with read.csv
No idea, but have you tried using ?scan to read those next 5 rows? It might give you a better idea of the pathologies that are causing problems. For example, an unmatched quote might result in some huge number of characters trying to be read into a single element of a character variable. As your previous respondent said, resolving such problems can be a challenge. Cheers, Bert On Mon, Apr 8,
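A sketch of the ?scan idea, pulling the suspect records as raw lines so a runaway quoted field shows up as one enormous string; the exact skip count (header plus the known-good records) is an assumption:
bad <- scan(file_name, what = character(), sep = "\n", quote = "",
            skip = 2459466, nlines = 5)
nchar(bad)   # a huge element points at the record with the unmatched quote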
2010 Feb 12
1
paired wilcox test on each row of a large dataframe
Hi, I have to calculate the V statistic for each row of a large data frame (28,000 rows). I cannot use the multtest package for a paired Wilcoxon test. I have been using a for loop, which is slow. Is there a way to speed up the computation with another method, like using apply or tapply? My data set looks like this: 11573_MB 11911_MB 11966_MB 12091_MB 12168_MB 12420_MB................ cg00000292
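A sketch of the apply-based version, assuming each row splits into two paired groups of equal length (columns 1:6 versus 7:12 stand in for the real pairing, and dat for the 28,000-row data frame); apply mostly tidies the code here, since wilcox.test itself dominates the run time:
V <- apply(dat, 1, function(r)
  wilcox.test(r[1:6], r[7:12], paired = TRUE)$statistic)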
2024 Apr 08
2
Exceptional slowness with read.csv
Hi Dave, That's rather frustrating. I've found vroom (from the package vroom) to be helpful with large files like this. Does the following give you any better luck? vroom(file_name, delim = ",", skip = 2459465, n_max = 5) Of course, when you know you've got errors & the files are big like that it can take a bit of work resolving things. The command line tools awk
2007 Oct 10
2
slow load() in R2.6.0
I'm encountering excruciatingly slow load times for character vectors in R 2.6.0-- up to 30sec for a 15K file that contains a no-attributes character vector of length ~1e4 and object size ~0.5MB. In R 2.5.1, repeated loads of the same set of files are near-instantaneous. The problem is proving tricky to reproduce consistently from scratch, so I have attached the 3 files used in the examples
2024 Apr 08
1
Exceptional slowness with read.csv
data.table's fread is also fast. Not sure about error handling. But I can merge 300 csvs with a total of 0.5m lines and 50 columns in a couple of minutes versus a lifetime with read.csv or readr::read_csv On Mon, 8 Apr 2024, 16:19 Stevie Pederson, <stephen.pederson.au at gmail.com> wrote: > Hi Dave, > > That's rather frustrating. I've found vroom (from the package
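A sketch of that fread-based merge; the directory and file pattern are illustrative:
library(data.table)
files <- list.files("csv_dir", pattern = "\\.csv$", full.names = TRUE)
merged <- rbindlist(lapply(files, fread), use.names = TRUE, fill = TRUE)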
2024 Apr 10
2
Exceptional slowness with read.csv
At 06:47 on 08/04/2024, Dave Dixon wrote: > Greetings, > > I have a csv file of 76 fields and about 4 million records. I know that > some of the records have errors - unmatched quotes, specifically. > Reading the file with readLines and parsing the lines with read.csv(text > = ...) is really slow. I know that the first 2459465 records are good. > So I try this: >
2007 Mar 02
5
extracting rows from a data frame by looping over the row names: performance issues
Hi, I have a big data frame: > mat <- matrix(rep(paste(letters, collapse=""), 5*300000), ncol=5) > dat <- as.data.frame(mat) and I need to do some computation on each row. Currently I'm doing this: > for (key in row.names(dat)) { row <- dat[key, ]; ... do some computation on row... } which could probably be considered a very natural (and R'ish) way of
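One commonly suggested rewrite: loop over integer positions instead of character row names, or stay with the matrix when the columns share a type, since matrix row extraction is far cheaper than data-frame rows:
for (i in seq_len(nrow(dat))) {
  row <- dat[i, ]    # positional indexing avoids the row-name lookup
  # ... do some computation on row ...
}
# or, since all columns here are character, work on mat directly:
for (i in seq_len(nrow(mat))) {
  row <- mat[i, ]
  # ... do some computation on row ...
}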
2024 Apr 08
2
Exceptional slowness with read.csv
I solved the mystery, but not the problem. The problem is that there's an unclosed quote somewhere in those 5 additional records I'm trying to access. So read.csv is reading million-character fields, and it's slow at that. That mystery is solved. However, the problem persists: how to fix what is obvious to the naked eye - a quote not adjacent to a comma - but that read.csv can't
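One hedged workaround while the quoting is broken: disable quote processing so the stray quote stays inside its own field, then repair it afterwards. Whether this is acceptable depends on whether legitimate fields contain quoted commas; the skip offset is an assumption:
next_records <- read.csv(file_name, header = FALSE, skip = 2459465 + 1,
                         nrows = 5, quote = "")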
2011 May 02
2
Lasso with Categorical Variables
Hi! This is my first time posting. I've read the general rules and guidelines, but please bear with me if I make some fatal error in posting. Anyway, I have a continuous response and 29 predictors made up of continuous variables and nominal and ordinal categorical variables. I'd like to do lasso on these, but I get an error. The way I am using "lars" doesn't allow for the
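A minimal sketch of the usual fix: expand the factors into dummy columns with model.matrix() so the predictor matrix is purely numeric. The object names below (dat, y) are placeholders, and glmnet is shown, though the same matrix can be handed to lars:
library(glmnet)
X <- model.matrix(y ~ ., data = dat)[, -1]   # dummy-code the categorical predictors
fit <- cv.glmnet(X, dat$y, alpha = 1)        # alpha = 1 gives the lasso penalty
coef(fit, s = "lambda.min")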
2024 Apr 10
1
Exceptional slowness with read.csv
That's basically what I did: 1. Get text lines using readLines 2. use tryCatch to parse each line using read.csv(text=...) 3. in the catch, use gregexpr to find any quotes not adjacent to a comma (gregexpr("[^,]\"[^,]",...) 4. escape any quotes found by adding a second quote (using str_sub from stringr) 6. parse the patched text using read.csv(text=...) 7. write out the parsed
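A rough sketch of that pipeline in one place; the gsub-based repair, the column check, and the exact regex are simplifications and assumptions, not the poster's exact code:
lines <- readLines(file_name)
parse_one <- function(txt, expected_cols = 76) {
  res <- tryCatch(read.csv(text = txt, header = FALSE),
                  error = function(e) NULL, warning = function(w) NULL)
  if (is.null(res) || ncol(res) != expected_cols) {
    fixed <- gsub('([^,])"([^,])', '\\1""\\2', txt)  # double the stray quote
    res <- read.csv(text = fixed, header = FALSE)
  }
  res
}
parsed <- lapply(lines[-1], parse_one)   # skip the header line
out <- do.call(rbind, parsed)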
2011 Sep 07
1
Very slow assignments
I'm creating an object of an S4 class that has two slots: ListExamples, which is a list, and idx, which is an integer (as in the code below). Then, I read a data.frame from a file with 10,000 (ten thousand) lines and 10 columns, do some pre-processing and, basically, store each line as an element of a list in the ListExamples slot of the S4 object. However, any kind of assignment operation (<-)
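Without the attached code it is hard to be precise, but the usual culprit is that every obj@ListExamples[[i]] <- ... assignment copies the whole S4 object. A sketch of the standard remedy, with names matching the slots described above and dat standing for the 10,000-line data frame:
examples <- vector("list", nrow(dat))
for (i in seq_len(nrow(dat))) {
  examples[[i]] <- dat[i, ]     # per-line pre-processing would go here
}
obj@ListExamples <- examples    # assign the slot once, outside the loop
obj@idx <- length(examples)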
2011 Nov 22
4
Data Frame Search Slow
Hey All, So - I promise to write a blog post on this topic and post it somewhere on the internet once I get to the bottom of this. Basically, the set-up to the problem is like this: 1. I have a data frame with dim (2547290, 4) 2. I need to make SQL-like lookups on the data frame. I have been using the following sort of syntax: a.dataframe[a.dataframe[[column_index]] %in% some_value, ] 3.
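A sketch of the keyed data.table approach that usually comes up for this problem: sort once on the lookup column, then each query is a binary search rather than a full %in% scan (lookup_column and some_value are placeholders):
library(data.table)
a.datatable <- as.data.table(a.dataframe)
setkey(a.datatable, lookup_column)      # one-off sort, builds the key
result <- a.datatable[J(some_value)]    # keyed lookup, binary search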