Displaying 20 results from an estimated 1000 matches similar to: "drop rare factors"
2011 Jul 12
3
when to use `which'?
when do I need to use which()?
> a <- c(1,2,3,4,5,6)
> a
[1] 1 2 3 4 5 6
> a[a==4]
[1] 4
> a[which(a==4)]
[1] 4
> which(a==4)
[1] 4
> a[which(a>2)]
[1] 3 4 5 6
> a[a>2]
[1] 3 4 5 6
>
seems unnecessary...
--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://jihadwatch.org http://palestinefacts.org http://mideasttruth.com
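One case where the two forms give different results is a vector that contains NA. A minimal sketch, using a made-up vector b that is not from the post:

```r
# Hypothetical vector with a missing value (not from the original post)
b <- c(1, NA, 4)

# Logical indexing: the comparison yields FALSE NA TRUE,
# and the NA index produces an NA element in the result
by_logical <- b[b == 4]

# which() returns only the positions where the condition is TRUE,
# so the NA comparison is dropped
by_which <- b[which(b == 4)]

print(by_logical)
print(by_which)
```

So which() is not needed for the examples in the post, but it may matter when missing values are possible or when the positions themselves are wanted.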
2012 Mar 14
2
sum(hist$density) == 2 ?!
> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2 ----------------------------- shouldn't it be 1?!
> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1 ----------------------------- now it's 1. why?!
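The density values are scaled so that the bar areas, not the bar heights, sum to 1. A short sketch of that check, with an arbitrary seed added only for reproducibility:

```r
set.seed(1)  # arbitrary seed, an assumption for reproducibility
x <- rnorm(1000)
h <- hist(x, plot = FALSE)

# Width of each bin; the default breaks are often narrower than 1
bin_widths <- diff(h$breaks)

# Area under the histogram: density times bin width, summed over bins
total_area <- sum(h$density * bin_widths)
print(total_area)
```

With breaks=(-4:4) every bin has width 1, so the plain sum of densities coincides with the area; with bins of width 0.5 the plain sum comes out as 2.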
2006 May 11
3
cannot turn some columns in a data frame into factors
Hi,
I have a data frame df and a list of names of columns that I want to
turn into factors:
df.names <- attr(df,"names")
sapply(factors, function (name) {
pos <- match(name,df.names)
if (is.na(pos)) stop(paste(name,": no such column\n"))
df[[pos]] <- factor(df[[pos]])
cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
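An assignment to df inside the function passed to sapply modifies a local copy only, so the outer data frame is left unchanged. A minimal sketch of one alternative, assuming a made-up data frame and column list that stand in for the post's df and factors:

```r
# Hypothetical stand-ins for the post's df and factors
df <- data.frame(id = 1:4,
                 colour = c("r", "g", "r", "b"),
                 size = c("S", "M", "S", "L"),
                 stringsAsFactors = FALSE)
factors <- c("colour", "size")

# Fail early if a requested column does not exist
missing_cols <- setdiff(factors, names(df))
if (length(missing_cols) > 0)
  stop("no such column: ", paste(missing_cols, collapse = ", "))

# Convert the selected columns in one assignment at the top level
df[factors] <- lapply(df[factors], factor)
str(df)
```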
2011 Feb 16
1
confused by lapply
Description:
'lapply' returns a list of the same length as 'X', each element of
which is the result of applying 'FUN' to the corresponding element
of 'X'.
I expect that when I do
> lapply(vec,f)
f would be called _once_ for each component of vec.
this is not what I see:
parse.num <- function (s) {
cat("parse.num1\n"); str(s)
s
2012 Aug 15
3
per-vertex statistics of edge weights
I have a graph with edge and vertex weights, stored in two data frames:
--8<---------------cut here---------------start------------->8---
vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3))
edges <-
2012 Mar 26
2
Error during wrapup: incorrect number of dimensions
when subsetting a matrix results in a single row, it is converted to a
vector, not a matrix.
how do I avoid this?
1. __GOOD__
> edges <- get.edges(g,E(g))
> edges
[,1] [,2]
[1,] 0 2
[2,] 0 3
[3,] 0 4
[4,] 0 5
[5,] 1 1
[6,] 0 4
[7,] 0 6
[8,] 0 7
[9,] 0 8
[10,] 0 9
[11,] 0 5
[12,] 0 10
[13,] 0 11
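The usual way to keep the matrix shape is the drop argument of the subsetting operator. A small sketch on a made-up matrix m (not the edges matrix from the post):

```r
# Hypothetical 3 x 2 matrix
m <- matrix(1:6, ncol = 2)

# Default behaviour: a single row is simplified to a plain vector
as_vector <- m[2, ]

# drop = FALSE keeps the result as a 1 x 2 matrix
as_matrix <- m[2, , drop = FALSE]

print(dim(as_vector))
print(dim(as_matrix))
```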
2011 Jul 12
1
how to find out whether a string is a factor?
I have two data frames:
> str(ysmd)
'data.frame': 8325 obs. of 6 variables:
$ X.stock : Factor w/ 8325 levels "A","AA","AA-",..: 2702 6547 4118 7664 7587 6350 3341 5640 5107 7589 ...
$ market.cap : num -1.00 2.97e+10 3.54e+08 3.46e+08 -1.00 ...
$ X52.week.low : num 40.2 22.5 27.5 12.2 20.7 ...
$
2012 Aug 10
1
summarize a vector
I have a long numeric vector v (length N) and I want to create a shorter
vector of length N/k consisting of sums of k-subsequences of v:
v <- c(1,2,3,4,5,6,7,8,9,10)
N=10, k=3
===> [6,15,24,10]
I can, of course, iterate:
> w <- vector(mode="numeric",length=ceiling(N/k))
> for (i in seq_along(w)) w[i] <- sum(v[((i-1)*k+1):(i*k)])
(modulo boundary conditions)
but I wonder if
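One loop-free possibility is to label each element with its group number and sum by group. A minimal sketch using the vector from the post:

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
k <- 3

# Group index for each element: 1 1 1 2 2 2 3 3 3 4
group <- ceiling(seq_along(v) / k)

# Sum within each group; as.vector() strips the names added by tapply()
w <- as.vector(tapply(v, group, sum))
print(w)
```

The last group is simply shorter when N is not a multiple of k, so no special boundary handling is needed in this form.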
2012 Nov 19
2
generated list element names
How can I create lists with element names created on the fly?
--8<---------------cut here---------------start------------->8---
> list (foo = 10)
$foo
[1] 10
> list ("foo" = 10)
$foo
[1] 10
> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
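The left-hand side of = in a list() call has to be a literal name, so a computed name needs to be attached separately. Two ways this might be done, sketched with the same example:

```r
# Name computed at run time
key <- paste("f", "oo", sep = "")

# Option 1: build the list, then attach the names
lst <- setNames(list(10), key)

# Option 2: assign into the list by name
lst2 <- list()
lst2[[key]] <- 10

print(lst)
```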
2013 Sep 18
2
strsplit with a vector split argument
Hi,
I find this behavior unexpected:
--8<---------------cut here---------------start------------->8---
> strsplit(c("a,b;c","d;e,f"),c(",",";"))
[[1]]
[1] "a" "b;c"
[[2]]
[1] "d" "e,f"
--8<---------------cut here---------------end--------------->8---
I thought that it should be identical to this:
2012 Oct 07
2
a merge() problem
I know it does not look very good - using the same column names to mean
different things in different data frames, but here you go:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=c(1,2,3),b=c(4,5,6))
> y <- data.frame(b=c(1,2),a=c("a","b"))
>
2012 Aug 28
5
variable scope
At the end of a for loop its variables are still present:
for (i in 1:10) {
x <- vector(length=100000000)
}
ls()
will print "i" and "x".
this means that at the end of the for loop body I have to write
rm(x)
gc()
is there a more elegant way to handle this?
Thanks.
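One option is to evaluate the loop body in its own environment with local(), so that temporaries disappear with it. A minimal sketch, with a small vector named big_buffer standing in for the large one in the post:

```r
for (i in 1:3) {
  local({
    # big_buffer is a hypothetical stand-in for the post's large vector;
    # it lives only in the environment created by local()
    big_buffer <- vector(length = 1000)
    # ... work with big_buffer here ...
  })
}

# FALSE as long as nothing else in the session defines big_buffer
buffer_leaked <- exists("big_buffer")
print(buffer_leaked)
```

The loop variable i is still visible afterwards; wrapping the whole loop in a function or in local() would hide it as well.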
2012 Jan 24
2
read.table: how to ignore errors?
I get this error from read.table():
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 234 did not have 8 elements
The error is genuine (an extra field separator between 1st and 2nd element).
1. is there a way to see this bad line 234 from R without diving into the file?
2. is there a way to ignore the bad lines and get the data from the good
lines only (I do
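Both points might be handled with count.fields() and readLines(). A minimal sketch on a made-up comma-separated file with 3 expected fields; the separator and field count are assumptions, and the line numbers only match up when the file has no blank or comment lines:

```r
# Hypothetical file with one malformed line (line 3 has an extra separator)
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,,5,6", "7,8,9"), path)

# Number of fields on each line
n_fields <- count.fields(path, sep = ",")
expected <- 3
bad <- which(n_fields != expected)

# 1. Look at the offending lines without opening the file by hand
all_lines <- readLines(path)
bad_lines <- all_lines[bad]
print(bad_lines)

# 2. Read only the well-formed lines
keep <- setdiff(seq_along(all_lines), bad)
good <- read.table(text = all_lines[keep], sep = ",", header = TRUE)
print(good)
```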
2012 Aug 30
3
apply --> data.frame
Is there a way for an apply-type function to return a data frame?
the closest thing I can think of is
foo <- as.data.frame(sapply(...))
names(foo) <- c(....)
is there a more "elegant" way?
Thanks!
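One pattern that yields a data frame directly is to have the function return a one-row data frame and bind the rows together. A minimal sketch with made-up column names n and square:

```r
# Each call returns a one-row data frame (columns are hypothetical)
rows <- lapply(1:3, function(i) data.frame(n = i, square = i^2))

# Stack the rows; column names and types come from the pieces
foo <- do.call(rbind, rows)
print(foo)
```

Unlike as.data.frame(sapply(...)), this keeps columns of different types intact and needs no separate names() call.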
2012 Sep 19
2
drop zero slots from table?
I find myself doing
--8<---------------cut here---------------start------------->8---
tab <- table(...)
tab <- tab[tab > 0]
tab <- sort(tab,decreasing=TRUE)
--8<---------------cut here---------------end--------------->8---
all the time.
I am wondering if the "drop 0" (and maybe even sort?) can be effected by
some magic argument to table() which I fail to discover
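Short of such an argument, the three steps can be wrapped once in a small helper. A minimal sketch; top_counts is a made-up name, not a base R function:

```r
# Hypothetical helper: tabulate, drop empty cells, sort by count
top_counts <- function(...) {
  tab <- table(...)
  sort(tab[tab > 0], decreasing = TRUE)
}

# Example with an unused level "c"
f <- factor(c("b", "a", "b"), levels = c("a", "b", "c"))
res <- top_counts(f)
print(res)
```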
2012 Aug 27
1
matrix.csr %*% matrix --> matrix
When a sparse matrix is multiplied by a regular one, the result is
usually not sparse. However, when matrix.csr is multiplied by a regular
matrix in R, a matrix.csr is produced.
Is there a way to avoid this?
Thanks!
2011 Jul 11
1
plot means ?
Hi,
I need this plot:
given: x,y - numerical vectors of length N
plot xi vs mean(yj such that |xj - xi|<epsilon)
(running mean?)
alternatively, discretize X as if for histogram plotting and plot mean y
over the center of the histogram group.
is there a simple way?
thanks!
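The second variant (mean y per histogram bin) can be sketched with hist(), cut() and tapply(). The data below are simulated purely for illustration:

```r
set.seed(1)  # arbitrary seed; x and y are made-up example data
x <- runif(200)
y <- 2 * x + rnorm(200, sd = 0.1)

# Reuse the histogram's own breaks to discretize x
h <- hist(x, plot = FALSE)
bin <- cut(x, breaks = h$breaks, include.lowest = TRUE)

# Mean of y within each bin (NA for empty bins)
mean_y <- tapply(y, bin, mean)
print(data.frame(mid = h$mids, mean_y = as.vector(mean_y)))
```

Calling plot(h$mids, mean_y, type = "b") then draws the bin means over the bin centers.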
2011 Apr 07
1
Assigning a larger number of levels to a factor that has fewer levels
Hello!
I have a larger and a smaller data frame with 1 factor in each - it's
the same factor:
large.frame<-data.frame(myfactor=LETTERS[1:10])
small.frame<-data.frame(myfactor=LETTERS[c(9,7,5,3,1)])
levels(large.frame$myfactor)
levels(small.frame$myfactor)
table(large.frame$myfactor)
table(small.frame$myfactor)
myfactor has 10 levels in large.frame and 5 levels in small.frame. All
5
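One way to give the smaller factor the full level set is to rebuild it with an explicit levels argument. A minimal sketch on plain factors that mirror the two columns in the post:

```r
# Factors mirroring large.frame$myfactor and small.frame$myfactor
large <- factor(LETTERS[1:10])
small <- factor(LETTERS[c(9, 7, 5, 3, 1)])

# Re-create the small factor using the levels of the large one
small_full <- factor(small, levels = levels(large))

print(levels(small_full))
print(table(small_full))
```

The values stay the same; only the set of levels grows, so table() now reports zero counts for the five unused levels.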
2012 Oct 16
5
uniq -c
I need an analogue of "uniq -c" for a data frame.
xtabs(), although dog slow, would have fit the bill nicely:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
user system elapsed
12.788 4.288 17.224
--8<---------------cut
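An alternative that counts only the row combinations actually present is aggregate() with a column of ones. A minimal sketch on a small made-up data frame; no timing claim is implied for the data in the post:

```r
# Hypothetical data frame with one duplicated row
x <- data.frame(a = c(1, 1, 2), b = c("u", "u", "v"))

# Sum a column of ones within each distinct combination of the columns of x
counts <- aggregate(list(Freq = rep(1L, nrow(x))), by = x, FUN = sum)
print(counts)
```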
2012 Aug 27
1
write.matrix.csr data conversion
> write.matrix.csr(mx, y = y, file = file)
> table(y)
0 1
5194394 23487
$ cut -d' ' -f1 f | sort | uniq -c
23487 2
5194394 1
i.e., 0 is written as 1 and 1 is written as 2.
why?
is there a way to disable this?