thr3ads.net - similar to: "aggregate() runs out of memory"

Displaying 20 results from an estimated 2000 matches similar to: "aggregate() runs out of memory"

2012 Dec 04

list to matrix?

How do I convert a list to a matrix? --8<---------------cut here---------------start------------->8--- list(c(50000, 101), c(1e+05, 46), c(150000, 31), c(2e+05, 17), c(250000, 19), c(3e+05, 11), c(350000, 12), c(4e+05, 25), c(450000, 19), c(5e+05, 16)) as.matrix(a) [,1] [1,] Numeric,2 [2,] Numeric,2 [3,] Numeric,2 [4,] Numeric,2 [5,] Numeric,2 [6,] Numeric,2 [7,]

drop zero slots from table?

2012 Sep 19

drop zero slots from table?

I find myself doing --8<---------------cut here---------------start------------->8--- tab <- table(...) tab <- tab[tab > 0] tab <- sort(tab,decreasing=TRUE) --8<---------------cut here---------------end--------------->8--- all the time. I am wondering if the "drop 0" (and maybe even sort?) can be effected by some magic argument to table() which I fail to discover

cor() on sets of vectors

2012 Feb 23

cor() on sets of vectors

suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN. I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN). my sets of vectors are arranged as data frames x & y (vector=column): x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10)) y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10)) cor(x,y) returns a _matrix_ of all pairwise correlations: cor(x,y)

non-consing count

2013 Jan 04

non-consing count

Hi, to count vector elements with some property, the standard idiom seems to be length(which): --8<---------------cut here---------------start------------->8--- x <- c(1,1,0,0,0) count.0 <- length(which(x == 0)) --8<---------------cut here---------------end--------------->8--- however, this approach allocates and discards 2 vectors: a logical vector of length=length(x) and an

variable scope

2012 Aug 28

variable scope

At the end of a for loop its variables are still present: for (i in 1:10) { x <- vector(length=100000000) } ls() will print "i" and "x". this means that at the end of the for loop body I have to write rm(x) gc() is there a more elegant way to handle this? Thanks. -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000

how to concatenate factor vectors?

2012 Oct 18

how to concatenate factor vectors?

How do I concatenate two vectors of factors? --8<---------------cut here---------------start------------->8--- > a <- factor(5:1,levels=1:9) > b <- factor(9:1,levels=1:9) > str(c(a,b)) int [1:14] 5 4 3 2 1 9 8 7 6 5 ... > str(unlist(list(a,b),use.names=FALSE)) Factor w/ 9 levels "1","2","3","4",..: 5 4 3 2 1 9 8 7 6 5 ...

as.data.frame(do.call(rbind,lapply)) produces something weird

2012 Nov 09

as.data.frame(do.call(rbind,lapply)) produces something weird

The following code: --8<---------------cut here---------------start------------->8--- > myfun <- function (x) list(x=x,y=x*x) > z <- as.data.frame(do.call(rbind,lapply(1:3,function(x) c(a=paste("a",x,sep=""),as.list(unlist(list(b=myfun(x),c=myfun(x*x*x)))))))) > z a b.x b.y c.x c.y 1 a1 1 1 1 1 2 a2 2 4 8 64 3 a3 3 9 27 729

uniq -c

2012 Oct 16

uniq -c

I need an analogue of "uniq -c" for a data frame. xtabs(), although dog slow, would have footed the bill nicely: --8<---------------cut here---------------start------------->8--- > x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32) > system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 )) user system elapsed 12.788 4.288 17.224 --8<---------------cut

vectorization & modifying globals in functions

2012 Dec 27

vectorization & modifying globals in functions

I have the following code: --8<---------------cut here---------------start------------->8--- d <- rep(10,10) for (i in 1:100) { a <- sample.int(length(d), size = 2) if (d[a[1]] >= 1) { d[a[1]] <- d[a[1]] - 1 d[a[2]] <- d[a[2]] + 1 } } --8<---------------cut here---------------end--------------->8--- it does what I want, i.e., modified vector d 100 times.

count.fields inconsistent with read.table?

2012 Feb 24

count.fields inconsistent with read.table?

Hi, batch is a vector of lines returned by readLines from a NL-line-terminated file, here is the relevant section: ========================================================= AA BB CC DD EE FF GG H H JJ KK LL MM ========================================================= as you can see, a line is corrupt; two CRLF's are inserted. This is okay, I drop the bad lines, at least I hope I do:

plot means ?

2011 Jul 11

plot means ?

Hi, I need this plot: given: x,y - numerical vectors of length N plot xi vs mean(yj such that |xj - xi|<epsilon) (running mean?) alternatively, discretize X as if for histogram plotting and plot mean y over the center of the histogram group. is there a simple way? thanks! -- Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031 http://thereligionofpeace.com

cannot coerce class '"rle"' into a data.frame

2012 Oct 16

cannot coerce class '"rle"' into a data.frame

why? > rle Run Length Encoding lengths: int [1:1650061] 2 2 8 2 4 5 6 3 26 46 ... values : chr [1:1650061] "4bbf9e94cbceb70c BG bg" "4fbbf2c67e0fb867 SK sk" ... > as.data.frame(rle) Error in as.data.frame.default(vertices.rle) : cannot coerce class '"rle"' into a data.frame it seems that rle.df <-

LiblineaR: read/write model files?

2012 Jul 13

LiblineaR: read/write model files?

How do I read/write liblinear models to files? E.g., if I train a model using the command line interface, I might want to load it into R to look the histogram of the weights. Or I might want to train a model in R and then apply it using a command line interface. -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/

select rows with identical columns from a data frame

2013 Jan 18

select rows with identical columns from a data frame

I have a data frame with several columns. I want to select the rows with no NAs (as with complete.cases) and all columns identical. E.g., for --8<---------------cut here---------------start------------->8--- > f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40)) > f a b c 1 1 1 1 2 NA NA NA 3 NA 3 5 4 4 40 40 --8<---------------cut

per-vertex statistics of edge weights

2012 Aug 15

per-vertex statistics of edge weights

I have a graph with edge and vertex weights, stored in two data frames: --8<---------------cut here---------------start------------->8--- vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3)) edges <-

cedta decided 'igraph' wasn't data.table aware

2013 Apr 21

cedta decided 'igraph' wasn't data.table aware

Hi, what does this mean? --8<---------------cut here---------------start------------->8--- > graph <- graph.data.frame(merged[!v,], vertices=ve, directed=FALSE) cedta decided 'igraph' wasn't data.table aware cedta decided 'igraph' wasn't data.table aware cedta decided 'igraph' wasn't data.table aware cedta decided 'igraph' wasn't

igraph: decompose.graph: Error: protect(): protection stack overflow

2012 Mar 20

igraph: decompose.graph: Error: protect(): protection stack overflow

I just got this error: > library(igraph) > comp <- decompose.graph(gr) Error: protect(): protection stack overflow Error: protect(): protection stack overflow > what can I do? the digraph is, indeed, large (300,000 vertexes), but there are very many very small components (which I would rather not discard). PS. the doc for decompose.graph does not say which mode is the default. --

where are these NAs coming from?

2012 Sep 19

where are these NAs coming from?

I see this: --8<---------------cut here---------------start------------->8--- > length(which(is.na(z$language))) [1] 0 > locals <- z[z$country == mycountry,] > length(which(is.na(locals$language))) [1] 229 --8<---------------cut here---------------end--------------->8--- where are those locals without the language coming from?! -- Sam Steingold (http://sds.podval.org/) on

apply --> data.frame

2012 Aug 30

apply --> data.frame

Is there a way for an apply-type function to return a data frame? the closest thing I think of is foo <- as.data.frame(sapply(...)) names(foo) <- c(....) is there a more "elegant" way? Thanks! -- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com http://honestreporting.com

"unsparse" a vector

2012 Feb 08

"unsparse" a vector

Suppose I have a vector of strings: c("A1B2","A3C4","B5","C6A7B8") [1] "A1B2" "A3C4" "B5" "C6A7B8" where each string is a sequence of <column><value> pairs (fixed width, in this example both value and name are 1 character, in reality the column name is 6 chars and value is 2 digits). I need to

similar to: aggregate() runs out of memory