thr3ads.net - similar to: "How to separate huge dataset into chunks"

2009 Apr 17

Manipulate single line in textfile

Hello all, Is it possible to modify a single line in a textfile? I know it is possible to load the whole text file, do the change, and save this as a new file. However, this is not practical in my case, because the document is huge and cannot be fully loaded in R. Any idea? Best, Guillaume
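One way to approach this without loading the whole file is to stream it through connections in chunks and change only the target line on the way through. The following is a minimal sketch under that assumption; `replace_line`, its arguments, and the chunk size are hypothetical names chosen for illustration, and a new output file is still written.

```r
# Hypothetical helper: copy `infile` to `outfile`, swapping in `new_text`
# for line number `target`. Only `chunk_size` lines are held in memory at once.
replace_line <- function(infile, outfile, target, new_text, chunk_size = 10000L) {
  con_in <- file(infile, open = "r")
  on.exit(close(con_in))
  con_out <- file(outfile, open = "w")
  on.exit(close(con_out), add = TRUE)

  offset <- 0L  # number of lines already copied
  repeat {
    lines <- readLines(con_in, n = chunk_size)
    if (length(lines) == 0L) break
    idx <- target - offset  # position of the target line within this chunk
    if (idx >= 1L && idx <= length(lines)) lines[idx] <- new_text
    writeLines(lines, con_out)
    offset <- offset + length(lines)
  }
  invisible(outfile)
}
```

Memory use is bounded by `chunk_size`, not by the size of the file.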

2008 Oct 16

Matrix starting at [0,0] instead of [1,1]?

Hello all, When I create a matrix, is there a way to make it start at [0,0], instead of [1,1]? That way, a 2x2 matrix would go from [0,0] to [1,1], instead of [1,1] to [2,2]. Best, Guillaume
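R indexes vectors and matrices from 1, and base R has no setting to change that. A minimal sketch of one workaround is a pair of small accessors that shift the indices; `get0` and `set0` are hypothetical names, not part of base R.

```r
# Hypothetical zero-based accessors for an ordinary (one-based) R matrix.
get0 <- function(mat, i, j) mat[i + 1L, j + 1L]
set0 <- function(mat, i, j, value) {
  mat[i + 1L, j + 1L] <- value
  mat  # return the modified copy
}

m <- matrix(1:4, nrow = 2)
get0(m, 0, 0)        # same element as m[1, 1]
m <- set0(m, 1, 1, 99L)
```

Note that `m[0, 0]` itself is not an error in R; it silently returns an empty matrix, which is why a wrapper is safer than indexing with zero directly.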

2009 Mar 19

How do I add a variable to a text file?

Hello all, I have a 2.0 GB dataset that I can't load into R, due to memory issues. The dataset itself is in a tab-delimited .txt file with 25 variables. I have a variable I'd like to add to the dataset. How do I do this? Best, Guillaume
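A possible approach, sketched here under stated assumptions, is to append the new column while streaming the file line by line in chunks. It assumes the file has a header row and that the new variable already exists as an in-memory vector with one value per data row; `append_column` and its arguments are hypothetical names.

```r
# Hypothetical helper: write a copy of a tab-delimited file with one extra
# column appended, reading at most `chunk_size` lines at a time.
append_column <- function(infile, outfile, new_name, new_values,
                          chunk_size = 10000L) {
  con_in <- file(infile, open = "r")
  on.exit(close(con_in))
  con_out <- file(outfile, open = "w")
  on.exit(close(con_out), add = TRUE)

  # Extend the header row with the new variable name.
  header <- readLines(con_in, n = 1L)
  writeLines(paste(header, new_name, sep = "\t"), con_out)

  done <- 0L  # number of data rows written so far
  repeat {
    lines <- readLines(con_in, n = chunk_size)
    if (length(lines) == 0L) break
    vals <- new_values[done + seq_along(lines)]
    writeLines(paste(lines, vals, sep = "\t"), con_out)
    done <- done + length(lines)
  }
  invisible(done)
}
```

The existing 25 columns are never parsed, so the cost is essentially one pass of reading and writing text.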

2010 Feb 28

Combining 2 columns into 1 column many times in a very large dataset

*Combining 2 columns into 1 column many times in a very large dataset* The clumsy solutions I am working on are not going to be very fast if I can get them to work and the true dataset is ~1500 X 45000 so they need to be efficient. I've searched the R help files and the archives for this list and have some possible workable solutions for 2) and 3) but not my question 1). However, I include

2006 Sep 20

Splitting a huge vector

Dear R users, I have a huge vector that I would like to split into unequal slices. However, the only way I can do this is to create another huge vector to define the groups that are used to split the original vector, e.g.

    # my vector is this
    a.vector <- seq(2, by=5, length=100)
    # indices where I would like to slice my vector
    cut.values <- c(30, 50, 100, 109, 300, 601, 803)
    # so I have to

2011 Jan 28

read.table() versus scan()

I need to import a large number of simple, space-delimited text files with a few columns of data each. The one quirk is that some rows are missing data and some contain junk text at the end of each line. A typical file might look like: a b c d 1 2 3 x 4 5 6 7 8 9 x 1 2 3 x c c 4 5 6 x 7 8 9 x I'm trying to avoid having to pre-process the text files, as they all sit on an ftp site that I

2005 Jun 06

AW: Reading huge chunks of data from MySQL into Windows R

In my (limited) experience R is more powerful concerning data manipulation. An example: I have a vector holding a user id. Some user ids can appear more than once. Doing SELECT COUNT(DISTINCT userid) on MySQL will take approx. 15 min. Doing length(unique(userid)) will take (almost) no time... So I think the other way round will serve best: Do everything in R and avoid using SQL on the database...
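As a small illustration of the R side of that comparison, the distinct-count idiom mentioned in the post looks like this. The `userid` values below are made up for the example.

```r
# Made-up user ids; some appear more than once.
userid <- c(101, 102, 101, 103, 102, 101)

# Number of distinct users, the R counterpart of COUNT(DISTINCT userid).
n_users <- length(unique(userid))
```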

2013 May 16

[LLVMdev] Undoing DAG Combiner patterns

A better way to handle this is to add a td pattern to match "add n, -c" to a subtraction. I believe several targets do something similar to this. Evan On May 16, 2013, at 7:12 AM, Tom Stellard <tom at stellard.net> wrote: > On Thu, May 16, 2013 at 02:03:14AM +0000, Martin Filteau wrote: >> Hi all, >> >> It's the first LLVM backend we do for our asynchronous

2007 Sep 29

templates with same name before extension are cached

Hi all, I was just wondering if this is the intended behavior. Here is my setup: controller def index respond_to do |f| f.xml { render :xml => true } f.html { render :layout => :none } end end In my views I have a file for each type index.herb index.xerb The first request I send is cached and interferes with the other one. For example, if I send an xml request

2007 Jul 10

Lattice: vertical barchart

barchart(Titanic, stack=F) produces a very nice horizontal barchart. Each panel has four groups of two bars. barchart(Titanic, stack=F, horizontal=F) doesn't produce the results I would have expected, as it produces this warning message: Warning message: y should be numeric in: bwplot.formula(x = as.formula(form), data = list(Class = c(1, And it results in each panel having 22 groups of

2013 May 16

[LLVMdev] Undoing DAG Combiner patterns

Hi all, It's the first LLVM backend we do for our asynchronous DSP. So, I apologize if this is a trivial question! The target-independent DAG combiner performs the following transformation: sub n, c -> add n, -c For our target, negative constants are more costly to encode. What is the best place to revert to a sub instruction? Kind regards, -- Martin

2005 Oct 19

diag() problem

Hi I have a matrix "u", for which diag() gives an error: u <- structure(c(5.42334674128216, -2.31319389204264, -5.83059042218476, -1.64112369640695, -2.31319389212801, 3.22737617646609, 1.85200668021569, -0.57102273078531, -5.83059042231881, 1.85200668008156, 11.9488923894962, -3.5525537165941, -1.64112369587405, -0.571022730886046, -3.55255371755604,

2005 Jun 06

Reading huge chunks of data from MySQL into Windows R

Dear List, I'm trying to use R under Windows on a huge database in MySQL via ODBC (technical reasons for this...). Now I want to read tables with some 160.000.000 entries into R. I would be lucky if anyone out there has some good hints on what to consider concerning memory management. I'm not sure about the best methods for reading such huge files into R. For the moment I split the whole

2007 Jun 01

Getting names of objects passed with "..."

Is there a tidy way to get the names of objects passed to a function via the "..." argument? rbind/cbind does what I want:

    test.func1 <- function(...) {
      nms <- rownames(rbind(..., deparse.level=1))
      print(nms)
    }
    x <- "some stuff"
    second <- "more stuff"
    test.func1(first=x, second)
    [1] "first" "second"

The usual
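One alternative that avoids building a matrix with rbind() is to inspect the unevaluated arguments with substitute(). This is a sketch only, not necessarily what the thread settled on; `test.func2` is a hypothetical name.

```r
# Hypothetical variant: use the argument name when one is supplied,
# otherwise fall back to the deparsed expression.
test.func2 <- function(...) {
  args <- as.list(substitute(list(...)))[-1L]  # drop the leading `list` symbol
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  unnamed <- nms == ""
  nms[unnamed] <- vapply(
    args[unnamed],
    function(a) paste(deparse(a), collapse = ""),
    character(1)
  )
  unname(nms)
}

x <- "some stuff"
second <- "more stuff"
test.func2(first = x, second)
```

Because the arguments are never evaluated, this also works when the objects are large or of different types.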

2015 May 12

Why is the diag function so slow (for extraction)?

>>>>> Steve Bronder <sbronder at stevebronder.com> >>>>> on Thu, 7 May 2015 11:49:49 -0400 writes: > Is it possible to replace c() with .subset()? It would be possible, but I think "entirely" wrong. .subset() is documented to be an internal function not to be used "lightly" and more to the point it is documented to *NOT*

1999 Aug 18

diag()

I would like to suggest a slight modification to diag(). In the case where x is a matrix with both row names and column names the same, it would be reasonable if the resulting vector also had those names. I often use diag() on variance matrices, where this modification is helpful. The modification requires replacing if (is.matrix(x) && nargs() == 1) return(c(x)[1 +

2011 Feb 19

Accessing Package NEWS (NEWS.Rd)

Okay. So, after having spent quite some time never really tracking down why my package NEWS files were unacceptable to readNEWS(), I noticed that there was recent (to me anyway) development that allowed the NEWS to be done as an Rd file. Sweet! A more standard format... I converted a NEWS file in one of my unreleased packages to Rd format. checkNEWS() gave it a thumbs up. But then it went south.

2015 May 05

Why is the diag function so slow (for extraction)?

Looks like the c(x)[...] bit used to be as.matrix(x)[...]. Not sure why the change was made many years ago, but this was before names were handled explicitly. It would definitely be better to not force the duplicate, at least in the case where we are sure c() and [ would not dispatch. Best, luke On Mon, 4 May 2015, peter dalgaard wrote: > >> On 04 May 2015, at 19:59 , franknarf

2008 May 08

Setting matrix dimnames in a list

Hey All, I was wondering if I could solicit a little input on what I'm trying to do here. I have a list of matrices, and I want to set their dimnames, but all I can come up with is this:

    x <- matrix(1:4,2)
    y <- matrix(5:8,2)
    z <- list(x,y)
    nm <- c("a","b")
    nms <- list(nm,nm)
    z <- lapply(z,function(x)dimnames(x)<-nms)

As you can see, this
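In the lapply() call quoted above, the anonymous function returns the value of the assignment, which is `nms`, so each list element ends up holding the names and not the matrix. A minimal sketch of one fix is to return the modified matrix explicitly; the argument name `m` is chosen here only for illustration.

```r
x <- matrix(1:4, 2)
y <- matrix(5:8, 2)
z <- list(x, y)
nm <- c("a", "b")
nms <- list(nm, nm)

# Set the dimnames on a local copy, then return that copy.
z <- lapply(z, function(m) {
  dimnames(m) <- nms
  m
})
```

Each element of `z` is now a 2x2 matrix with rows and columns labelled "a" and "b".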

2015 May 13

Why is the diag function so slow (for extraction)?

As kindly pointed out to me (oh my decaying gray matter), is.object() is better suited for this test;

    $ svn diff src/library/base/R/diag.R
    Index: src/library/base/R/diag.R
    ===================================================================
    --- src/library/base/R/diag.R (revision 68345)
    +++ src/library/base/R/diag.R (working copy)
    @@ -23,9 +23,11 @@
     stop("'nrow' or
