thr3ads.net - similar to: "drop rare factors"

Displaying 20 results from an estimated 1000 matches similar to: "drop rare factors"

when to use `which'?

2011 Jul 12

when do I need to use which()?
> a <- c(1,2,3,4,5,6)
> a
[1] 1 2 3 4 5 6
> a[a==4]
[1] 4
> a[which(a==4)]
[1] 4
> which(a==4)
[1] 4
> a[which(a>2)]
[1] 3 4 5 6
> a[a>2]
[1] 3 4 5 6
>
seems unnecessary...
-- Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031 http://jihadwatch.org http://palestinefacts.org http://mideasttruth.com
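One case where the two forms do differ is missing values: an NA in a logical index produces an NA in the result, while which() returns only the positions that are TRUE. A minimal sketch (the vector b is made up for illustration and is not from the post):

```r
# Hypothetical vector with a missing value
b <- c(1, NA, 4, 4)

by_logical <- b[b == 4]         # the NA in the logical index propagates
by_which   <- b[which(b == 4)]  # which() keeps only the TRUE positions

print(by_logical)
print(by_which)
```

For vectors without NA, as in the post, both forms give the same answer.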

sum(hist$density) == 2 ?!

2012 Mar 14

> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2
----------------------------- shouldn't it be 1?!
> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1
----------------------------- now it's 1. why?!
-- Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000 http://www.childpsy.net/ http://www.memritv.org
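A density is a height, so what should come to 1 is the total area of the bars, that is, density times bin width summed over the bins. With bins of width 0.5 the plain sum of densities is 2, and with unit-width bins it happens to equal the area. A small sketch of the check (the seed is arbitrary and only there to make the run repeatable):

```r
set.seed(1)                  # arbitrary seed, for repeatability only
x <- rnorm(1000)
h <- hist(x, plot = FALSE)

widths <- diff(h$breaks)     # width of each bin
area   <- sum(h$density * widths)

print(unique(round(widths, 10)))
print(area)
```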

cannot turn some columns in a data frame into factors

2006 May 11

Hi, I have a data frame df and a list of names of columns that I want to turn into factors:
df.names <- attr(df,"names")
sapply(factors, function (name) {
  pos <- match(name,df.names)
  if (is.na(pos)) stop(paste(name,": no such column\n"))
  df[[pos]] <- factor(df[[pos]])
  cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
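The assignment inside the function passed to sapply() changes only a local copy of df, so the data frame in the calling environment keeps its old columns. One way around this, sketched here with a made-up data frame and made-up column names, is to assign the converted columns back in a single step:

```r
# Hypothetical data; the column names are assumptions for illustration
df <- data.frame(id = 1:4,
                 grp = c("a", "b", "a", "b"),
                 site = c("x", "x", "y", "y"),
                 stringsAsFactors = FALSE)
factors <- c("grp", "site")

# Fail early if a requested column does not exist
missing_cols <- setdiff(factors, names(df))
if (length(missing_cols) > 0)
  stop("no such column: ", paste(missing_cols, collapse = ", "))

# Convert the selected columns and write them back into df itself
df[factors] <- lapply(df[factors], factor)

print(sapply(df, is.factor))
```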

confused by lapply

2011 Feb 16

Description: 'lapply' returns a list of the same length as 'X', each element of which is the result of applying 'FUN' to the corresponding element of 'X'.
I expect that when I do
> lapply(vec,f)
f would be called _once_ for each component of vec. this is not what I see:
parse.num <- function (s) {
  cat("parse.num1\n"); str(s)
  s

per-vertex statistics of edge weights

2012 Aug 15

I have a graph with edge and vertex weights, stored in two data frames:
--8<---------------cut here---------------start------------->8---
vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3))
edges <-

Error during wrapup: incorrect number of dimensions

2012 Mar 26

when subsetting a matrix results in a single row, it is converted to a vector, not a matrix. how do I avoid this?
1. __GOOD__
> edges <- get.edges(g,E(g))
> edges
      [,1] [,2]
 [1,]    0    2
 [2,]    0    3
 [3,]    0    4
 [4,]    0    5
 [5,]    1    1
 [6,]    0    4
 [7,]    0    6
 [8,]    0    7
 [9,]    0    8
[10,]    0    9
[11,]    0    5
[12,]    0   10
[13,]    0   11
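The usual way to keep the matrix shape is the drop = FALSE argument of the `[` operator. A minimal sketch with a small made-up matrix (not the igraph edge list from the post):

```r
m <- matrix(1:6, ncol = 2)   # hypothetical 3 x 2 matrix

one_row_vec <- m[m[, 1] == 2, ]                # dimensions are dropped
one_row_mat <- m[m[, 1] == 2, , drop = FALSE]  # stays a 1 x 2 matrix

print(dim(one_row_vec))
print(dim(one_row_mat))
```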

how to find out whether a string is a factor?

2011 Jul 12

I have two data frames:
> str(ysmd)
'data.frame': 8325 obs. of 6 variables:
 $ X.stock : Factor w/ 8325 levels "A","AA","AA-",..: 2702 6547 4118 7664 7587 6350 3341 5640 5107 7589 ...
 $ market.cap : num -1.00 2.97e+10 3.54e+08 3.46e+08 -1.00 ...
 $ X52.week.low : num 40.2 22.5 27.5 12.2 20.7 ...
 $

summarize a vector

2012 Aug 10

I have a long numeric vector v (length N) and I want to create a shorter vector of length N/k consisting of sums of k-subsequences of v:
v <- c(1,2,3,4,5,6,7,8,9,10) N=10, k=3 ===> [6,15,24,10]
I can, of course, iterate:
> w <- vector(mode="numeric",length=ceiling(N/k))
> for (i in 1:length(w)) w[i] <- sum(v[((i-1)*k+1):(i*k)])
(modulo boundary conditions) but I wonder if
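One loop-free possibility is to label each element with its chunk number and sum within chunks using tapply(). This is only a sketch of one approach; the variable name group is introduced here for illustration:

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
k <- 3

# Chunk index for every element: 1 1 1 2 2 2 3 3 3 4
group <- ceiling(seq_along(v) / k)

# Sum within each chunk; the last chunk may be shorter than k
w <- as.vector(tapply(v, group, sum))
print(w)   # 6 15 24 10
```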

generated list element names

2012 Nov 19

How can I create lists with element names created on the fly?
--8<---------------cut here---------------start------------->8---
> list (foo = 10)
$foo
[1] 10
> list ("foo" = 10)
$foo
[1] 10
> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
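The left-hand side of `=` inside list() must be a literal name, so a computed name has to be attached some other way. Two common options, shown as a sketch (nm, l1 and l2 are names made up for this example):

```r
nm <- paste("f", "oo", sep = "")   # name computed at run time

l1 <- setNames(list(10), nm)       # attach the names after building the list

l2 <- list()
l2[[nm]] <- 10                     # or assign by computed name

print(l1)
print(l2)
```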

strsplit with a vector split argument

2013 Sep 18

Hi, I find this behavior unexpected:
--8<---------------cut here---------------start------------->8---
> strsplit(c("a,b;c","d;e,f"),c(",",";"))
[[1]]
[1] "a"   "b;c"
[[2]]
[1] "d"   "e,f"
--8<---------------cut here---------------end--------------->8---
I thought that it should be identical to this:
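The output above is consistent with the documented behavior that a split argument of length greater than one is recycled along x: the first string is split on "," and the second on ";". If the goal is to split every string on either separator (an assumption, since the post is cut off here), a character class is one way to do it:

```r
x <- c("a,b;c", "d;e,f")

# split is recycled along x: "," is used for x[1], ";" for x[2]
recycled <- strsplit(x, c(",", ";"))

# a single regular expression matching either separator
either <- strsplit(x, "[,;]")

print(recycled)
print(either)
```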

a merge() problem

2012 Oct 07

I know it does not look very good - using the same column names to mean different things in different data frames, but here you go:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=c(1,2,3),b=c(4,5,6))
> y <- data.frame(b=c(1,2),a=c("a","b"))
>

variable scope

2012 Aug 28

At the end of a for loop its variables are still present:
for (i in 1:10) { x <- vector(length=100000000) }
ls() will print "i" and "x". this means that at the end of the for loop body I have to write
rm(x)
gc()
is there a more elegant way to handle this? Thanks.
-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
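A for loop does not open a new scope in R, but local() (or a function body) does, so temporaries created inside it are not left behind in the workspace. A small sketch, with loop_i and big_tmp as made-up names and a much smaller vector than in the post:

```r
local({
  for (loop_i in 1:3) {
    big_tmp <- vector(length = 1000)  # temporary, visible only inside local()
  }
})

# Neither name exists in the calling environment afterwards
print(exists("big_tmp"))
print(exists("loop_i"))
```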

read.table: how to ignore errors?

2012 Jan 24

I get this error from read.table():
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 234 did not have 8 elements
The error is genuine (an extra field separator between 1st and 2nd element).
1. is there a way to see this bad line 234 from R without diving into the file?
2. is there a way to ignore the bad lines and get the data from the good lines only (I do
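Both questions can be approached with count.fields(), which reports the number of fields on each line, together with readLines(). The sketch below uses a small temporary file with made-up contents and assumes a comma separator and no blank lines; neither detail comes from the post:

```r
# Hypothetical file: the third line has an extra separator
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,,5,6", "7,8,9"), tmp)

n   <- count.fields(tmp, sep = ",")   # fields per line
bad <- which(n != n[1])               # lines that disagree with the header

lines <- readLines(tmp)
print(lines[bad])                     # 1. look at the offending lines

# 2. read only the well-formed lines
good <- read.table(text = lines[-bad], sep = ",", header = TRUE)
print(good)

unlink(tmp)
```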

apply --> data.frame

2012 Aug 30

Is there a way for an apply-type function to return a data frame? the closest thing I think of is
foo <- as.data.frame(sapply(...))
names(foo) <- c(....)
is there a more "elegant" way? Thanks!
-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com http://honestreporting.com
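One common idiom is to have the function return a one-row data frame and bind the pieces with do.call(rbind, ...), which keeps the column names and types chosen inside the function. A sketch with made-up columns id and square:

```r
# Each call returns a one-row data frame with named, typed columns
rows <- lapply(1:3, function(i) data.frame(id = i, square = i^2))

# Stack the rows into a single data frame
foo <- do.call(rbind, rows)
print(foo)
```

For a large number of pieces this can be slow, so it is best seen as a convenience rather than a general answer.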

drop zero slots from table?

2012 Sep 19

I find myself doing
--8<---------------cut here---------------start------------->8---
tab <- table(...)
tab <- tab[tab > 0]
tab <- sort(tab,decreasing=TRUE)
--8<---------------cut here---------------end--------------->8---
all the time. I am wondering if the "drop 0" (and maybe even sort?) can be effected by some magic argument to table() which I fail to discover
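If the zero counts come from unused factor levels (an assumption; the arguments to table() are elided in the post), dropping those levels before tabulating avoids the zeros in the first place. A sketch with a made-up factor f:

```r
# Hypothetical factor with an unused level "c"
f <- factor(c("b", "a", "b", "b"), levels = c("a", "b", "c"))

print(table(f))                       # includes a zero count for "c"

tab <- sort(table(droplevels(f)), decreasing = TRUE)
print(tab)                            # only levels that actually occur
```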

matrix.csr %*% matrix --> matrix

2012 Aug 27

When a sparse matrix is multiplied by a regular one, the result is usually not sparse. However, when matrix.csr is multiplied by a regular matrix in R, a matrix.csr is produced. Is there a way to avoid this? Thanks!
-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://palestinefacts.org http://truepeace.org

plot means ?

2011 Jul 11

Hi, I need this plot:
given: x,y - numerical vectors of length N
plot xi vs mean(yj such that |xj - xi|<epsilon) (running mean?)
alternatively, discretize X as if for histogram plotting and plot mean y over the center of the histogram group.
is there a simple way? thanks!
-- Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031 http://thereligionofpeace.com

Assigning a larger number of levels to a factor that has fewer levels

2011 Apr 07

Hello! I have a larger and a smaller data frame with 1 factor in each - it's the same factor:
large.frame<-data.frame(myfactor=LETTERS[1:10])
small.frame<-data.frame(myfactor=LETTERS[c(9,7,5,3,1)])
levels(large.frame$myfactor)
levels(small.frame$myfactor)
table(large.frame$myfactor)
table(small.frame$myfactor)
myfactor has 10 levels in large.frame and 5 levels in small.frame. All 5
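Going by the subject line, the aim is to give the smaller factor the full set of levels. Rebuilding it with factor(..., levels = ...) does that without changing the values. In the sketch below stringsAsFactors = TRUE is set explicitly, because since R 4.0 data.frame() no longer converts strings to factors by default:

```r
large.frame <- data.frame(myfactor = LETTERS[1:10], stringsAsFactors = TRUE)
small.frame <- data.frame(myfactor = LETTERS[c(9, 7, 5, 3, 1)],
                          stringsAsFactors = TRUE)

before <- as.character(small.frame$myfactor)

# Re-create the factor with the larger level set; the values stay the same
small.frame$myfactor <- factor(small.frame$myfactor,
                               levels = levels(large.frame$myfactor))

print(nlevels(small.frame$myfactor))
print(table(small.frame$myfactor))   # unused levels now show a count of 0
```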

uniq -c

2012 Oct 16

I need an analogue of "uniq -c" for a data frame. xtabs(), although dog slow, would have footed the bill nicely:
--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=1:32,b=1:32,c=1:32,d=1:32,e=1:32)
> system.time(subset(as.data.frame(xtabs( ~. , x )), Freq != 0 ))
   user  system elapsed
 12.788   4.288  17.224
--8<---------------cut
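xtabs() builds a cell for every combination of values, which is why it is slow when most cells are empty. An alternative that only visits combinations that occur is to add a column of ones and sum it per distinct row with aggregate(). This is a sketch on a tiny made-up data frame, not a timing comparison; the column name n is introduced here:

```r
# Hypothetical data with one repeated row pattern
x <- data.frame(a = c(1, 1, 2, 1), b = c("u", "u", "v", "u"))

# Count occurrences of each distinct row
counts <- aggregate(n ~ ., data = cbind(x, n = 1), FUN = sum)
print(counts)
```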

write.matrix.csr data conversion

2012 Aug 27

> write.matrix.csr(mx, y = y, file = file)
> table(y)
      0       1
5194394   23487
$ cut -d' ' -f1 f | sort | uniq -c
  23487 2
5194394 1
i.e., 0 is written as 1 and 1 is written as 2. why? is there a way to disable this?
-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://palestinefacts.org
