Displaying 20 results from an estimated 1000 matches similar to: "str on large data.frame is slow on factors with many levels"
2012 Aug 27
1
write.matrix.csr data conversion
> write.matrix.csr(mx, y = y, file = file)
> table(y)
0 1
5194394 23487
$ cut -d' ' -f1 f | sort | uniq -c
23487 2
5194394 1
i.e., 0 is written as 1 and 1 is written as 2.
why?
is there a way to disable this?
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org
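One possible explanation, stated as an assumption rather than a confirmed account of what write.matrix.csr does internally: if y is a factor, converting it to integers gives the internal level codes (1, 2, ...) instead of the labels (0, 1). The base R sketch below only illustrates that effect, using made-up labels in place of the y from the post.

```r
# Hypothetical labels standing in for the y from the post
y <- factor(c(0, 1, 0, 0, 1))

codes  <- as.integer(y)                 # internal level codes: 1 2 1 1 2
labels <- as.numeric(as.character(y))   # original labels:      0 1 0 0 1
```

Whether passing the numeric labels instead of a factor changes the written file has not been checked here.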
2012 Feb 13
1
entropy package: how to compute mutual information?
suppose I have two factor vectors:
x <- as.factor(c("a","b","a","c","b","c"))
y <- as.factor(c("b","a","a","c","c","b"))
I can compute their entropies:
entropy(table(x))
[1] 1.098612
using
library(entropy)
but it is not clear how to compute their mutual information
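A minimal sketch in base R, assuming the plug-in (empirical) estimator is what is wanted: mutual information can be written as H(X) + H(Y) - H(X,Y), with each entropy computed from a table of counts. The helper H is introduced here for illustration and is not part of the entropy package.

```r
x <- as.factor(c("a","b","a","c","b","c"))
y <- as.factor(c("b","a","a","c","c","b"))

# Plug-in Shannon entropy (in nats) from a table of counts
H <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]            # treat 0 * log(0) as 0
  -sum(p * log(p))
}

# I(X;Y) = H(X) + H(Y) - H(X,Y); table(x, y) holds the joint counts
mi <- H(table(x)) + H(table(y)) - H(table(x, y))
```

For these two vectors every observed (x, y) pair occurs exactly once, so mi equals log(1.5). The entropy package appears to offer mi.empirical() on the joint table for the same quantity; its documentation should be checked before relying on that.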
2012 Mar 14
2
sum(hist$density) == 2 ?!
> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2 ----------------------------- shouldn't it be 1?!
> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1 ----------------------------- now it's 1. why?!
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://www.memritv.org
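The densities are heights, not probabilities, so it is the area (density times bin width) that sums to 1. With unit-width bins such as breaks=(-4:4) the two coincide; with narrower default bins the plain sum is larger. A short sketch, with an arbitrary seed added only for reproducibility:

```r
set.seed(1)                        # arbitrary seed, not from the original post
x <- rnorm(1000)
h <- hist(x, plot = FALSE)

widths <- diff(h$breaks)           # bin widths chosen by hist()
area   <- sum(h$density * widths)  # total area under the histogram
```

Here area is 1 (up to floating point error) regardless of how the breaks are chosen.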
2012 Aug 27
1
matrix.csr %*% matrix --> matrix
When a sparse matrix is multiplied by a regular one, the result is
usually not sparse. However, when matrix.csr is multiplied by a regular
matrix in R, a matrix.csr is produced.
Is there a way to avoid this?
Thanks!
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://truepeace.org
2012 Apr 04
2
recover lost global function
Since R has the same namespace for functions and variables,
> c <- 1
kills the global function, which can be restored by
> c <- get("c",mode="function")
Is there a way to prevent R from overriding globals
or at least warning when I do that
or at least warning when I replace a functional value with non-functional?
thanks.
--
Sam Steingold (http://sds.podval.org/)
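A short sketch of what actually happens: the assignment creates a variable that shadows base::c in the current environment, but the base function is not destroyed. When a name is used in call position, R skips bindings that are not functions, and removing the variable makes the name resolve to the base function again. I am not aware of a built-in option that warns on such assignments.

```r
c <- 1                 # shadows base::c with a number in the current environment
v <- c(2, 3)           # still works: call lookup skips non-function bindings
w <- base::c(2, 3)     # explicit reference, unaffected by the shadowing

rm(c)                  # drop the variable; the name c resolves to base::c again
```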
2012 Jul 13
1
LiblineaR: read/write model files?
How do I read/write liblinear models to files?
E.g., if I train a model using the command line interface, I might want
to load it into R to look at the histogram of the weights.
Or I might want to train a model in R and then apply it using a command
line interface.
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/
2012 Aug 30
3
apply --> data.frame
Is there a way for an apply-type function to return a data frame?
the closest thing I can think of is
foo <- as.data.frame(sapply(...))
names(foo) <- c(....)
is there a more "elegant" way?
Thanks!
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com
http://honestreporting.com
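One common base R idiom, sketched under the assumption that each element can be turned into a one-row data frame: build the rows with lapply() and bind them with do.call(rbind, ...). The function one_row and its columns are hypothetical.

```r
# Each call returns a one-row data frame; column names are set once, here
one_row <- function(i) data.frame(n = i, square = i^2, label = letters[i])

foo <- do.call(rbind, lapply(1:3, one_row))
```

Unlike as.data.frame(sapply(...)), this keeps a separate type per column (numeric and character here) and needs no later names(foo) <- ... step.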
2012 Sep 20
1
aggregate help
I want to count attributes of IDs:
--8<---------------cut here---------------start------------->8---
z <- data.frame(id=c(10,20,10,30,10,20),
a1=c("a","b","a","c","b","b"),
a2=c("x","y","x","z","z","y"),
2011 Feb 16
2
create a data frame with the given column names
how do I create a data frame with the given column names
_NOT KNOWN IN ADVANCE_?
i.e., I have a vector of strings for names and I want to get an _EMPTY_
data frame with these column names.
is it at all possible?
--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)
http://openvotingconsortium.org http://pmw.org.il http://memri.org
http://mideasttruth.com
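It is possible. A minimal sketch, assuming the names arrive as a character vector (nms is a made-up example): build a zero-row matrix with one column per name, convert it, and set the names.

```r
nms <- c("id", "score", "label")   # hypothetical names, known only at run time

# Zero rows, one column per name; setNames() applies the names afterwards
empty <- setNames(data.frame(matrix(nrow = 0, ncol = length(nms))), nms)
```

The columns of empty start out as logical; they take on other types as rows are added with rbind().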
2013 Jan 18
5
select rows with identical columns from a data frame
I have a data frame with several columns.
I want to select the rows with no NAs (as with complete.cases)
and all columns identical.
E.g., for
--8<---------------cut here---------------start------------->8---
> f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40))
> f
a b c
1 1 1 1
2 NA NA NA
3 NA 3 5
4 4 40 40
--8<---------------cut
2013 Apr 21
1
cedta decided 'igraph' wasn't data.table aware
Hi, what does this mean?
--8<---------------cut here---------------start------------->8---
> graph <- graph.data.frame(merged[!v,], vertices=ve, directed=FALSE)
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't data.table aware
cedta decided 'igraph' wasn't
2012 Aug 15
3
per-vertex statistics of edge weights
I have a graph with edge and vertex weights, stored in two data frames:
--8<---------------cut here---------------start------------->8---
vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3))
edges <-
2011 Mar 18
1
time series from timed data
Hi,
I have data with multiple sub-second entries:
2011/03/15 09:32:15.035619,-0.403103,1.09664,48.6,126.92,117.32
2011/03/15 09:32:15.069331,-0.39851,1.09874,48.6,126.92,117.32
2011/03/15 09:32:15.289135,-0.402463,1.10084,48.59,126.92,117.32
2011/03/15 09:32:15.296110,-0.450244,1.10063,48.59,126.92,117.32
2011/03/15 09:32:15.451358,-0.438813,1.10273,48.59,126.93,117.32
2011/03/15
2012 Mar 20
2
igraph: decompose.graph: Error: protect(): protection stack overflow
I just got this error:
> library(igraph)
> comp <- decompose.graph(gr)
Error: protect(): protection stack overflow
Error: protect(): protection stack overflow
>
what can I do?
the digraph is, indeed, large (300,000 vertexes), but there are very
many very small components (which I would rather not discard).
PS. the doc for decompose.graph does not say which mode is the default.
--
2012 Sep 06
2
merge a list of data frames
I have a list of data frames:
> str(data)
List of 4
$ :'data.frame': 700773 obs. of 3 variables:
..$ V1: chr [1:700773] "200130446465779" "200070050127778" "200030633708779" "200010587002779" ...
..$ V2: int [1:700773] 0 0 0 0 0 0 0 0 0 0 ...
..$ V3: num [1:700773] 1 1 1 1 1 ...
$ :'data.frame': 700773 obs. of 3 variables:
..$
2012 Feb 10
2
the value of the last expression
Is there an analogue of common lisp "*" variable which contains the
value of the last expression?
E.g., in lisp:
> (+ 1 2)
3
> *
3
I wish I could recover the value of the last expression without
re-evaluating it.
thanks
--
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://camera.org http://ffii.org
2012 Oct 18
3
how to concatenate factor vectors?
How do I concatenate two vectors of factors?
--8<---------------cut here---------------start------------->8---
> a <- factor(5:1,levels=1:9)
> b <- factor(9:1,levels=1:9)
> str(c(a,b))
int [1:14] 5 4 3 2 1 9 8 7 6 5 ...
> str(unlist(list(a,b),use.names=FALSE))
Factor w/ 9 levels "1","2","3","4",..: 5 4 3 2 1 9 8 7 6 5 ...
2006 Mar 17
6
removing NA from a data frame
Hi,
It appears that deal does not support missing values (NA), so I need to
remove them (NAs) from my data frame.
how do I do this?
(I am very new to R, so a detailed step-by-step
explanation with code samples would be nice).
Some columns (variables) have quite a few NAs, so I would rather drop
the whole column than sacrifice all the rows (observations) which have
NA in that column.
How do I
2011 Jul 12
3
when to use `which'?
when do I need to use which()?
> a <- c(1,2,3,4,5,6)
> a
[1] 1 2 3 4 5 6
> a[a==4]
[1] 4
> a[which(a==4)]
[1] 4
> which(a==4)
[1] 4
> a[which(a>2)]
[1] 3 4 5 6
> a[a>2]
[1] 3 4 5 6
>
seems unnecessary...
--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://jihadwatch.org http://palestinefacts.org http://mideasttruth.com
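The two forms differ once the vector contains NA: a logical index passes the NA through as an NA element, while which() returns only the positions where the condition is TRUE. A small sketch with made-up data:

```r
a <- c(1, NA, 3, 4)

by_logical <- a[a > 2]          # NA 3 4 : the NA in the index yields an NA element
by_which   <- a[which(a > 2)]   # 3 4    : which() keeps only the TRUE positions
```

which() is also what you need when the positions themselves are wanted rather than the values.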
2012 Feb 08
4
"unsparse" a vector
Suppose I have a vector of strings:
c("A1B2","A3C4","B5","C6A7B8")
[1] "A1B2" "A3C4" "B5" "C6A7B8"
where each string is a sequence of <column><value> pairs
(fixed width, in this example both value and name are 1 character, in
reality the column name is 6 chars and value is 2 digits).
I need to