thr3ads.net - similar to: "plot with a regression line(s)"

Displaying 20 results from an estimated 4000 matches similar to: "plot with a regression line(s)"

2012 Feb 10

the value of the last expression

Is there an analogue of the Common Lisp "*" variable, which contains the value of the last expression? E.g., in Lisp:

> (+ 1 2)
3
> *
3

I wish I could recover the value of the last expression without re-evaluating it. Thanks.

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000 http://www.childpsy.net/ http://camera.org http://ffii.org
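One candidate in R is `.Last.value`, which base R keeps up to date at the top-level prompt. The lines below are a minimal sketch that assumes an interactive session; the variable name `y` is only for illustration.

```r
# At the interactive prompt, R stores the value of the most recent
# top-level expression in .Last.value.
1 + 2

# Capture that value without re-evaluating the expression.
y <- .Last.value
print(y)
```

Note that `.Last.value` tracks top-level evaluation, so it may not behave the same way when the lines are run through `source()`.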

2006 May 11

cannot turn some columns in a data frame into factors

Hi, I have a data frame df and a list of names of columns that I want to turn into factors:

df.names <- attr(df,"names")
sapply(factors, function (name) {
  pos <- match(name,df.names)
  if (is.na(pos)) stop(paste(name,": no such column\n"))
  df[[pos]] <- factor(df[[pos]])
  cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
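The post is cut off, but one likely cause is that the assignment to df inside the anonymous function only changes a local copy, so the data frame in the calling scope is left untouched. A minimal sketch of one way around this, using made-up example data (the column names and the variable `unknown` are assumptions, not from the post):

```r
# Hypothetical example data; `factors` holds the names of the columns to convert.
df <- data.frame(id = 1:3,
                 color = c("red", "blue", "red"),
                 size = c("S", "M", "S"),
                 stringsAsFactors = FALSE)
factors <- c("color", "size")

# Fail early on unknown column names, as the posted code intends.
unknown <- setdiff(factors, names(df))
if (length(unknown) > 0) stop("no such column: ", paste(unknown, collapse = ", "))

# Convert the selected columns and assign the result back into df
# in the calling scope, instead of inside a function body.
df[factors] <- lapply(df[factors], factor)

print(sapply(df, is.factor))
```

Assigning the whole `lapply()` result at once keeps the modification outside any function, so no local copy is involved.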

2012 Mar 14

sum(hist$density) == 2 ?!

> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2

shouldn't it be 1?!

> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1

now it's 1. why?!

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000 http://www.childpsy.net/ http://www.memritv.org
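A probable explanation: the density values are heights per unit of x, so it is the area (density times bin width) that sums to 1, not the densities themselves. A sum of 2 would be consistent with default bins 0.5 wide, while breaks=(-4:4) gives bins of width 1. A small sketch to check this, with an arbitrary seed chosen only for reproducibility:

```r
set.seed(42)  # arbitrary seed, only for reproducibility
x <- rnorm(1000)
h <- hist(x, plot = FALSE)

# Bin widths follow from the break points chosen by hist().
widths <- diff(h$breaks)

# The total area under the histogram is density times width, summed over bins.
area <- sum(h$density * widths)

print(unique(round(widths, 10)))
print(area)
```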

2012 Mar 20

igraph: decompose.graph: Error: protect(): protection stack overflow

I just got this error:

> library(igraph)
> comp <- decompose.graph(gr)
Error: protect(): protection stack overflow
Error: protect(): protection stack overflow
>

What can I do? The digraph is, indeed, large (300,000 vertices), but there are very many very small components (which I would rather not discard).

PS. The doc for decompose.graph does not say which mode is the default.

2012 Feb 23

cor() on sets of vectors

Suppose I have two sets of vectors: x1,x2,...,xN and y1,y2,...,yN. I want N correlations: cor(x1,y1), cor(x2,y2), ..., cor(xN,yN). My sets of vectors are arranged as data frames x & y (vector=column):

x <- data.frame(a=rnorm(10),b=rnorm(10),c=rnorm(10))
y <- data.frame(d=rnorm(10),e=rnorm(10),f=rnorm(10))

cor(x,y) returns a _matrix_ of all pairwise correlations:

cor(x,y)
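One way to get only the N matching correlations is to walk the columns of both data frames in parallel with `mapply()`. This is a sketch, not necessarily what the thread settled on; the names `pairwise` and `via.diag` are made up here.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
y <- data.frame(d = rnorm(10), e = rnorm(10), f = rnorm(10))

# mapply() pairs up the columns of x and y: cor(x$a, y$d), cor(x$b, y$e), ...
pairwise <- mapply(cor, x, y)

# The same numbers sit on the diagonal of the full matrix,
# at the cost of computing all N^2 correlations.
via.diag <- diag(cor(x, y))

print(pairwise)
```

For large N the `mapply()` form avoids building the full N-by-N matrix.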

2012 Oct 07

a merge() problem

I know it does not look very good - using the same column names to mean different things in different data frames, but here you go:

--8<---------------cut here---------------start------------->8---
> x <- data.frame(a=c(1,2,3),b=c(4,5,6))
> y <- data.frame(b=c(1,2),a=c("a","b"))
>

2012 Feb 08

"unsparse" a vector

Suppose I have a vector of strings:

c("A1B2","A3C4","B5","C6A7B8")
[1] "A1B2" "A3C4" "B5" "C6A7B8"

where each string is a sequence of <column><value> pairs (fixed width, in this example both value and name are 1 character, in reality the column name is 6 chars and value is 2 digits). I need to

2013 Jan 18

select rows with identical columns from a data frame

I have a data frame with several columns. I want to select the rows with no NAs (as with complete.cases) and all columns identical. E.g., for

--8<---------------cut here---------------start------------->8---
> f <- data.frame(a=c(1,NA,NA,4),b=c(1,NA,3,40),c=c(1,NA,5,40))
> f
   a  b  c
1  1  1  1
2 NA NA NA
3 NA  3  5
4  4 40 40
--8<---------------cut
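A possible row-wise test for this, sketched on the data frame from the post: a row is kept when it has no NA and all of its values collapse to a single unique value. The names `keep` and `selected` are made up for the example, and `apply()` is only one of several ways to write it.

```r
f <- data.frame(a = c(1, NA, NA, 4), b = c(1, NA, 3, 40), c = c(1, NA, 5, 40))

# For each row: no missing values, and exactly one distinct value.
keep <- apply(f, 1, function(row) !anyNA(row) && length(unique(row)) == 1)

selected <- f[keep, ]
print(selected)
```

On this data only the first row qualifies; the last row is complete but its columns differ.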

2012 Aug 30

apply --> data.frame

Is there a way for an apply-type function to return a data frame? The closest thing I can think of is

foo <- as.data.frame(sapply(...))
names(foo) <- c(....)

Is there a more "elegant" way? Thanks!

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://palestinefacts.org http://dhimmi.com http://honestreporting.com
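One common base-R pattern is to have the applied function return a one-row data frame and bind the pieces with `do.call(rbind, ...)`. The sketch below uses a hypothetical helper `describe` and made-up inputs; it is an illustration, not the answer given in the thread.

```r
# Hypothetical per-item function that returns a one-row data frame.
describe <- function(v) data.frame(n = length(v), mean = mean(v), max = max(v))

inputs <- list(first = c(1, 2, 3), second = c(10, 20))

# lapply() yields a list of one-row data frames; rbind stacks them,
# so column names and types come from describe() itself.
foo <- do.call(rbind, lapply(inputs, describe))
print(foo)
```

Because each piece is already a data frame, columns of different types survive, which `sapply()` followed by `as.data.frame()` does not guarantee.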

2012 Feb 10

naiveBayes: slow predict, weird results

I did this:

nb <- naiveBayes(users, platform)
pl <- predict(nb,users)
nrow(users) ==> 314781
ncol(users) ==> 109

1. naiveBayes() was quite fast (~20 seconds), while predict() was slow (tens of minutes). Why?
2. The predict results were completely off the mark (quite the opposite of the expected overfitting). Suffice it to show the tables:

pl: android blackberry ipad

2011 Dec 21

qqnorm & huge datasets

Hi, when I run qqnorm on a vector of length 10M+, I get a huge PDF file which cannot be loaded by acroread or evince. Any suggestions (apart from sampling the data)? Thanks.

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000 http://mideasttruth.com http://honestreporting.com http://camera.org http://openvotingconsortium.org http://pmw.org.il
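One option that avoids random sampling is to plot a fixed number of sample quantiles against the matching normal quantiles, so the output holds a thousand points rather than ten million. This is a sketch under assumptions: the simulated `x` stands in for the real data, and 1000 points is an arbitrary choice.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x <- rnorm(1e6)             # stand-in for the real 10M+ vector

p <- ppoints(1000)          # 1000 evenly spread probability points
sample.q <- quantile(x, p, names = FALSE)
theory.q <- qnorm(p)

# Same axes as qqnorm(), but with a fixed, small number of points.
plot(theory.q, sample.q,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")

# Rough reference line: sample quantile = mean + sd * normal quantile.
abline(mean(x), sd(x))
```

The trade-off is that the most extreme tail points are no longer drawn individually, since the probability grid stops short of 0 and 1.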

2013 Sep 18

strsplit with a vector split argument

Hi, I find this behavior unexpected:

--8<---------------cut here---------------start------------->8---
> strsplit(c("a,b;c","d;e,f"),c(",",";"))
[[1]]
[1] "a"   "b;c"

[[2]]
[1] "d"   "e,f"

--8<---------------cut here---------------end--------------->8---

I thought that it should be identical to this:
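The post is cut off before the expected form, so the following is only a sketch of the behavior shown: a vector `split` is recycled along the input, one separator per string. If the goal is to split every string on either separator, a single regular expression with a character class does that. The names `recycled` and `either` are made up for the example.

```r
x <- c("a,b;c", "d;e,f")

# A vector `split` is recycled along x: "," is used for x[1], ";" for x[2].
recycled <- strsplit(x, c(",", ";"))

# One regular expression with a character class splits on either separator.
either <- strsplit(x, "[,;]")

print(recycled)
print(either)
```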

2012 Aug 15

per-vertex statistics of edge weights

I have a graph with edge and vertex weights, stored in two data frames:

--8<---------------cut here---------------start------------->8---
vertices <- data.frame(vertex=c("a","b","c","d"),weight=c(1,2,1,3))
edges <-

2013 Jan 04

non-consing count

Hi, to count vector elements with some property, the standard idiom seems to be length(which):

--8<---------------cut here---------------start------------->8---
x <- c(1,1,0,0,0)
count.0 <- length(which(x == 0))
--8<---------------cut here---------------end--------------->8---

however, this approach allocates and discards 2 vectors: a logical vector of length=length(x) and an
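A partial improvement, sketched below, is to sum the logical vector directly: TRUE counts as 1, so the index vector built by `which()` is never created. The logical vector from the comparison is still allocated, so this is not fully non-consing.

```r
x <- c(1, 1, 0, 0, 0)

# TRUE is counted as 1 and FALSE as 0, so sum() counts the matches
# without the intermediate index vector that which() would build.
count.0 <- sum(x == 0)

# If x may contain NA, sum(x == 0, na.rm = TRUE) ignores those elements.
print(count.0)
```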

2012 Jan 20

extract fixed width fields from a string

Hi, I have a data frame with one column containing strings of the form "ABC...|XYZ..." where ABC etc. are fields of 6 alphanumeric characters each and XYZ etc. are fields of 8 alphanumeric characters each; "|" is a mandatory separator. I do not know in advance how many fields of each kind each row will contain. I need to extract these fields from the string. How do I do that?
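A minimal sketch of one approach: split each string at the "|" separator, then cut each half into pieces of the known width with `substring()`. The helper name `chop.fixed` and the sample string are made up for illustration, and the sketch assumes both halves are non-empty.

```r
# Hypothetical helper: cut a string into consecutive pieces of `width` characters.
chop.fixed <- function(s, width) {
  if (nchar(s) == 0) return(character(0))
  starts <- seq(1, nchar(s), by = width)
  substring(s, starts, starts + width - 1)
}

# Made-up sample: two 6-character fields, "|", two 8-character fields.
s <- "ABCDEFGHIJKL|12345678ABCDEFGH"

# fixed = TRUE treats "|" literally instead of as regex alternation.
parts <- strsplit(s, "|", fixed = TRUE)[[1]]

six <- chop.fixed(parts[1], 6)
eight <- chop.fixed(parts[2], 8)

print(six)
print(eight)
```

For a whole data frame column, the same steps could be wrapped in a function and applied with `lapply()`, since the number of fields varies per row.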

2011 Jul 12

when to use `which'?

when do I need to use which()?

> a <- c(1,2,3,4,5,6)
> a
[1] 1 2 3 4 5 6
> a[a==4]
[1] 4
> a[which(a==4)]
[1] 4
> which(a==4)
[1] 4
> a[which(a>2)]
[1] 3 4 5 6
> a[a>2]
[1] 3 4 5 6
>

seems unnecessary...

-- Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031 http://jihadwatch.org http://palestinefacts.org http://mideasttruth.com
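One case where the two forms differ is missing data: logical indexing passes an NA in the condition through to the result, while `which()` drops it. A small sketch with a made-up vector:

```r
a <- c(1, NA, 3, 4)

# The comparison yields FALSE NA TRUE TRUE; the NA index produces an NA element.
by.logical <- a[a > 2]

# which() returns only the positions that are TRUE, so the NA is dropped.
by.which <- a[which(a > 2)]

print(by.logical)
print(by.which)
```

`which()` is also the natural choice when the positions themselves are needed rather than the selected values.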

2012 Feb 24

count.fields inconsistent with read.table?

Hi, batch is a vector of lines returned by readLines from a NL-line-terminated file. Here is the relevant section:

=========================================================
AA BB CC DD EE FF GG H H JJ KK LL MM
=========================================================

As you can see, a line is corrupt; two CRLFs are inserted. This is okay, I drop the bad lines, at least I hope I do:

2011 Jul 11

plot means ?

Hi, I need this plot. Given x, y (numerical vectors of length N), plot xi vs mean(yj such that |xj - xi| < epsilon) (a running mean?). Alternatively, discretize x as if for histogram plotting and plot the mean of y over the center of each histogram group. Is there a simple way? Thanks!

-- Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031 http://thereligionofpeace.com
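The second variant (binned means) can be sketched with `cut()` and `tapply()`. The data below are simulated and the bin width of 0.1 is an arbitrary assumption; the names `bins`, `bin.means` and `centers` are made up for the example.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- runif(200)                   # simulated stand-in data
y <- 2 * x + rnorm(200, sd = 0.1)

# Discretize x into equal-width bins, as a histogram would.
breaks <- seq(0, 1, by = 0.1)
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# Mean of y within each bin (NA for a bin with no points).
bin.means <- tapply(y, bins, mean)

# Midpoint of each bin, used as the x coordinate.
centers <- head(breaks, -1) + diff(breaks) / 2

plot(centers, bin.means, type = "b",
     xlab = "x (bin center)", ylab = "mean of y")
```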

2012 Sep 14

please comment on my function

this function is supposed to canonicalize the language:

--8<---------------cut here---------------start------------->8---
canonicalize.language <- function (s) {
  s <- tolower(s)
  long <- nchar(s) == 5
  s[long] <- sub("^([a-z]{2})[-_][a-z]{2}$","\\1",s[long])
  s[nchar(s) != 2 & s != "c"] <- "unknown"
  s
}
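A quick way to review such a function is to run it on a few representative inputs. The sketch below repeats the function as posted and feeds it made-up values (the input vector is an assumption, not from the post).

```r
# The function as posted.
canonicalize.language <- function (s) {
  s <- tolower(s)
  long <- nchar(s) == 5
  s[long] <- sub("^([a-z]{2})[-_][a-z]{2}$", "\\1", s[long])
  s[nchar(s) != 2 & s != "c"] <- "unknown"
  s
}

# Made-up inputs: locale-style codes, a bare code, the "c" locale, free text.
input <- c("en-US", "pt_BR", "EN", "c", "french")
result <- canonicalize.language(input)
print(result)
```

On these inputs the locale-style codes are reduced to their two-letter prefix, "c" is kept, and anything else that is not two characters long becomes "unknown".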

2012 Nov 19

generated list element names

How can I create lists with element names created on the fly?

--8<---------------cut here---------------start------------->8---
> list (foo = 10)
$foo
[1] 10

> list ("foo" = 10)
$foo
[1] 10

> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
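The parser only accepts a literal name or string to the left of "=", so a computed name has to be attached in a separate step. Two base-R ways to do that, as a sketch (the variable names `name`, `a` and `b` are made up):

```r
name <- paste("f", "oo", sep = "")

# Option 1: build the list first, then attach the computed names.
a <- setNames(list(10), name)

# Option 2: assign into the list using the computed name as the index.
b <- list()
b[[name]] <- 10

print(a)
print(b)
```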
