Displaying 20 results from an estimated 1000 matches similar to: "summarize a vector"
2012 Sep 19
drop zero slots from table?
I find myself doing
--8<---------------cut here---------------start------------->8---
tab <- table(...)
tab <- tab[tab > 0]
tab <- sort(tab,decreasing=TRUE)
--8<---------------cut here---------------end--------------->8---
all the time.
I am wondering if the "drop 0" (and maybe even sort?) can be effected by
some magic argument to table() which I fail to discover
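As far as I can tell, table() has no argument that drops empty cells or sorts the result, so one option is to wrap the three steps in a small helper. A minimal sketch for one-way tables; the name `sorted_table` is hypothetical and not part of base R:

```r
# Hypothetical helper: tabulate, drop empty cells, sort by frequency.
# Intended for one-way tables; subsetting a multi-way table with
# tab[tab > 0] flattens it and loses the dimnames.
sorted_table <- function(...) {
  tab <- table(...)
  tab <- tab[tab > 0]            # drop zero counts (e.g. unused factor levels)
  sort(tab, decreasing = TRUE)
}

f <- factor(c("b", "a", "b", "b", "a", "d"), levels = c("a", "b", "c", "d"))
tab <- sorted_table(f)           # level "c" has count 0 and is dropped
```

If the zeros come only from unused factor levels, `table(droplevels(f))` should avoid creating them in the first place.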
2012 Dec 04
list to matrix?
How do I convert a list to a matrix?
--8<---------------cut here---------------start------------->8---
list(c(50000, 101), c(1e+05, 46), c(150000, 31), c(2e+05, 17),
c(250000, 19), c(3e+05, 11), c(350000, 12), c(4e+05, 25),
c(450000, 19), c(5e+05, 16))
[1,] Numeric,2
[2,] Numeric,2
[3,] Numeric,2
[4,] Numeric,2
[5,] Numeric,2
[6,] Numeric,2
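One common way to get a plain numeric matrix from such a list is to bind the elements as rows. A minimal sketch, assuming every list element has the same length (only the first three elements of the list above are used here):

```r
l <- list(c(50000, 101), c(1e+05, 46), c(150000, 31))

# Bind the list elements together as rows: one row per element.
m <- do.call(rbind, l)

# Equivalent when all elements have length 2:
m2 <- matrix(unlist(l), ncol = 2, byrow = TRUE)
```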
2012 Aug 28
variable scope
At the end of a for loop its variables are still present:
for (i in 1:10) {
  x <- vector(length=100000000)
}
will print "i" and "x".
this means that at the end of the for loop body I have to write
is there a more elegant way to handle this?
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
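One possible approach is to run the loop inside local(), so that the loop variables live in a temporary environment that is discarded afterwards. A minimal sketch (the vector is shortened here; the `before` and `leaked` names are only there to illustrate the effect):

```r
before <- ls()

# Everything assigned inside local() stays in its own environment.
local({
  for (i in 1:10) {
    x <- vector(length = 1000)
  }
})

# Names added to the calling environment by the loop (expected: none).
leaked <- setdiff(ls(), c(before, "before"))
```

Wrapping the loop in a function and calling it has the same effect; anything that should survive has to be returned explicitly.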
2012 Sep 14
aggregate() runs out of memory
I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns).
I want to get the result of
table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
Alas, aggregate has been running for ~30 minutes, RSS is 14G, VIRT is
24.3G, and there is no end in sight.
Both V1 and V2 are characters (not factors).
Is there anything I could do to speed this up?
Sam Steingold
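Since the aggregate() call only counts rows per id, the same histogram of group sizes can probably be had from two nested table() calls, which never touch V1. A sketch on a toy stand-in for Z (the data below is made up for illustration), assuming the count does not depend on V1 and that NA ids can be ignored:

```r
Z <- data.frame(V1 = c("p", "q", "r", "s", "t", "u", "v"),
                V2 = c("a", "a", "b", "c", "c", "c", "d"),
                stringsAsFactors = FALSE)

# Original formulation: rows per id, then the distribution of those counts.
slow <- table(aggregate(Z$V1, FUN = length, by = list(id = Z$V2))$x)

# Inner table(): rows per id. Outer table(): how many ids have each count.
fast <- table(table(Z$V2))
```

On a data frame with millions of rows this is likely much cheaper than aggregate(), but that is worth timing on the real data.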
2013 Apr 21
cedta decided 'igraph' wasn't data.table aware
Hi, what does this mean?
--8<---------------cut here---------------start------------->8---
> graph <- graph.data.frame(merged[!v,], vertices=ve, directed=FALSE)
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't
2012 Nov 07
LiblineaR: accept sparse matrices
It would be nice if LiblineaR() accepted data in the form of a sparse
matrix (it does not accept whatever e1071::read.matrix.csr returns).
It would also be nice if there were functions to read/write files in the
native liblinear file format; I am sure the original liblinear library
provides at least the input code.
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04
2012 Oct 16
uniq -c
I need an analogue of "uniq -c" for a data frame.
xtabs(), although dog slow, would have fit the bill nicely:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
user system elapsed
12.788 4.288 17.224
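One alternative that only visits the combinations actually present in the data is to add a column of ones and sum it per unique row with aggregate(). A minimal sketch on made-up data; note that this counts all duplicate rows (like `sort | uniq -c`), not just adjacent runs, and that the formula interface drops rows containing NA:

```r
x <- data.frame(a = c(1, 1, 2, 2, 2),
                b = c("u", "u", "v", "v", "w"),
                stringsAsFactors = FALSE)

# Freq = 1 for every row; summing it per unique (a, b) gives the row count.
counts <- aggregate(Freq ~ ., data = cbind(x, Freq = 1), FUN = sum)
```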
2012 Oct 15
what to use for sna/graphs?
What do people use for SNA/graph analysis in R?
So far I have been using igraph (it implements the Louvain community
detection algorithm as multilevel.community, which is the killer feature
for me).
However, igraph is severely lacking in visualization, which I also need.
graphviz & gephi are alleged to be good at visualization, but,
apparently, not so for analysis (specifically, community
2012 Feb 24
count.fields inconsistent with read.table?
batch is a vector of lines returned by readLines from a
NL-line-terminated file, here is the relevant section:
as you can see, a line is corrupt; two CRLF's are inserted.
This is okay, I drop the bad lines, at least I hope I do:
2013 Jan 04
non-consing count
to count vector elements with some property, the standard idiom seems to
be length(which):
--8<---------------cut here---------------start------------->8---
x <- c(1,1,0,0,0)
count.0 <- length(which(x == 0))
--8<---------------cut here---------------end--------------->8---
however, this approach allocates and discards 2 vectors: a logical
vector of length=length(x) and an
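A slightly cheaper idiom is to sum the logical vector directly. This is only a partial answer: the comparison still allocates the logical vector, but the index vector built by which() is avoided. A minimal sketch:

```r
x <- c(1, 1, 0, 0, 0)

# TRUE counts as 1 and FALSE as 0, so the sum is the number of matches.
count.0 <- sum(x == 0)

# With missing values, sum(x == 0, na.rm = TRUE) ignores the NAs.
```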
2012 Nov 19
generated list element names
How can I create lists with element names created on the fly?
--8<---------------cut here---------------start------------->8---
> list (foo = 10)
[1] 10
> list ("foo" = 10)
[1] 10
> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
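The left-hand side of `=` inside list() has to be a literal name, so computed names need to be attached separately. Two ways that should work, sketched below:

```r
# Build the list first, then attach the computed name.
l1 <- setNames(list(10), paste("f", "oo", sep = ""))

# Or assign into an existing list using a computed name.
l2 <- list()
l2[[paste("f", "oo", sep = "")]] <- 10
```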
2012 Jul 13
LiblineaR: read/write model files?
How do I read/write liblinear models to files?
E.g., if I train a model using the command line interface, I might want
to load it into R to look at the histogram of the weights.
Or I might want to train a model in R and then apply it using a command
line interface.
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
2012 Aug 24
SparseM buglet
read.matrix.csr does not close the connection:
> library('SparseM')
Package SparseM (0.96) loaded.
> read.matrix.csr(foo)
Warning message:
closing unused connection 3 (foo)
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
2012 Oct 18
how to concatenate factor vectors?
How do I concatenate two vectors of factors?
--8<---------------cut here---------------start------------->8---
> a <- factor(5:1,levels=1:9)
> b <- factor(9:1,levels=1:9)
> str(c(a,b))
int [1:14] 5 4 3 2 1 9 8 7 6 5 ...
> str(unlist(list(a,b),use.names=FALSE))
Factor w/ 9 levels "1","2","3","4",..: 5 4 3 2 1 9 8 7 6 5 ...
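Besides the unlist() trick, a version-independent option is to go through the character labels and rebuild the factor with the union of the levels. A minimal sketch using the same a and b:

```r
a <- factor(5:1, levels = 1:9)
b <- factor(9:1, levels = 1:9)

# Concatenate the labels, then re-encode against the combined level set.
ab <- factor(c(as.character(a), as.character(b)),
             levels = union(levels(a), levels(b)))
```

I believe that in newer R versions (4.1.0 and later) c(a, b) itself returns a factor, but the code above does not rely on that.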
2012 Aug 15
per-vertex statistics of edge weights
I have a graph with edge and vertex weights, stored in two data frames:
--8<---------------cut here---------------start------------->8---
vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3))
edges <-
2012 Oct 16
cannot coerce class '"rle"' into a data.frame
> rle
Run Length Encoding
lengths: int [1:1650061] 2 2 8 2 4 5 6 3 26 46 ...
values : chr [1:1650061] "4bbf9e94cbceb70c BG bg" "4fbbf2c67e0fb867 SK sk" ...
> as.data.frame(rle)
Error in as.data.frame.default(vertices.rle) :
cannot coerce class '"rle"' into a data.frame
it seems that
rle.df <-
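An "rle" object is a list with two components of equal length, so one way around the missing as.data.frame method is to build the data frame from those components by hand. A minimal sketch on made-up input (not necessarily what the poster went on to write):

```r
r <- rle(c("a", "a", "b", "c", "c", "c"))

# One row per run: its length and its value.
rle.df <- data.frame(lengths = r$lengths,
                     values = r$values,
                     stringsAsFactors = FALSE)
```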
2012 Feb 23
cor() on sets of vectors
suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN.
I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN).
my sets of vectors are arranged as data frames x & y (vector=column):
x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))
cor(x,y) returns a _matrix_ of all pairwise correlations:
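Since data frames are lists of columns, one way to get only the N column-by-column correlations is mapply(), which walks x and y in parallel. A minimal sketch (the seed is there only to make the example reproducible):

```r
set.seed(1)
x <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
y <- data.frame(d = rnorm(10), e = rnorm(10), f = rnorm(10))

# cor(x$a, y$d), cor(x$b, y$e), cor(x$c, y$f)
pairwise <- mapply(cor, x, y)

# Same numbers, but computes all N^2 correlations first.
from.matrix <- diag(cor(x, y))
```

For large N the mapply() form avoids computing the off-diagonal entries that are thrown away by diag().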
2012 Mar 13
multi-histogram plotting
I have a vector x:
2 3 4 5 6 7 8 9 10 11 12 13 14
45547 11835 4692 2241 1386 820 593 425 298 239 176 158 115
15 16 17 18 19 20 21 22 23 24 25 26 27
94 88 76 67 47 46 40 20 30 22 20 33 14
28 29 30 31 32 33 34 35 36
2012 Dec 27
vectorization & modifying globals in functions
I have the following code:
--8<---------------cut here---------------start------------->8---
d <- rep(10,10)
for (i in 1:100) {
  a <- sample.int(length(d), size = 2)
  if (d[a[1]] >= 1) {
    d[a[1]] <- d[a[1]] - 1
    d[a[2]] <- d[a[2]] + 1
  }
}
--8<---------------cut here---------------end--------------->8---
It does what I want, i.e., it modifies vector d 100 times.
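If the loop body is moved into a function, one way to avoid modifying a global is to pass d in and return the updated copy. A minimal sketch; the function name `move_one` is hypothetical, and the seed is only there for reproducibility:

```r
# Hypothetical helper: move one unit between two randomly chosen slots.
move_one <- function(d) {
  a <- sample.int(length(d), size = 2)   # two distinct positions
  if (d[a[1]] >= 1) {
    d[a[1]] <- d[a[1]] - 1
    d[a[2]] <- d[a[2]] + 1
  }
  d                                      # return the modified copy
}

set.seed(42)
d <- rep(10, 10)
for (i in 1:100) d <- move_one(d)
```

Each step depends on the result of the previous one, so this remains a sequential loop; the sketch only addresses the global-modification part of the question.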
2012 Nov 09
as.data.frame(do.call(rbind,lapply)) produces something weird
The following code:
--8<---------------cut here---------------start------------->8---
> myfun <- function (x) list(x=x,y=x*x)
> z <- as.data.frame(do.call(rbind,lapply(1:3,function(x) c(a=paste("a",x,sep=""),as.list(unlist(list(b=myfun(x),c=myfun(x*x*x))))))))
> z
a b.x b.y c.x c.y
1 a1 1 1 1 1
2 a2 2 4 8 64
3 a3 3 9 27 729
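A likely cause of the odd result is that rbind() on lists produces a list-matrix, so as.data.frame() ends up with list columns even though the printout looks normal. One way around it is to turn each row into a one-row data frame before binding. A minimal sketch reusing myfun from above:

```r
myfun <- function(x) list(x = x, y = x * x)

# One single-row data frame per input value.
rows <- lapply(1:3, function(x) {
  row <- c(list(a = paste("a", x, sep = "")),
           as.list(unlist(list(b = myfun(x), c = myfun(x * x * x)))))
  as.data.frame(row, stringsAsFactors = FALSE)
})

# rbind() on data frames keeps ordinary atomic columns.
z <- do.call(rbind, rows)
```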