thr3ads.net - similar to: "read.table: how to ignore errors?"

Displaying 20 results from an estimated 7000 matches similar to: "read.table: how to ignore errors?"

count.fields inconsistent with read.table?

2012 Feb 24

Hi, batch is a vector of lines returned by readLines from a NL-line-terminated file, here is the relevant section: ========================================================= AA BB CC DD EE FF GG H H JJ KK LL MM ========================================================= as you can see, a line is corrupt; two CRLF's are inserted. This is okay, I drop the bad lines, at least I hope I do:

strptime format = "%H:%M:%OS6"

2011 Feb 15

I read a dataset with times in them, e.g., "09:31:29.18761". I then parse them: > all$X.Time <- strptime(all$X.Time, format = "%H:%M:%OS6"); and get a vector of NAs (how do I check that except for a visual inspection?) then I do > options("digits.secs"=6); > all$X.Time <- strptime(all$X.Time, format = "%H:%M:%OS"); and it, apparently, works:
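A minimal sketch of the parsing step, assuming the times are plain character strings. As far as the `strptime` documentation describes it, `%OSn` is an output-only specification, while plain `%OS` accepts fractional seconds on input; the sample values and the `"not a time"` entry are hypothetical. Counting `NA`s with `is.na()` replaces the visual inspection.

```r
# Hypothetical sample values modeled on the time quoted above.
times <- c("09:31:29.18761", "09:32:26.37222", "not a time")

# Plain %OS reads fractional seconds on input.
parsed <- strptime(times, format = "%H:%M:%OS", tz = "UTC")

# Count (and locate) the values that failed to parse.
n_failed <- sum(is.na(parsed))
failed_at <- which(is.na(parsed))

# digits.secs only affects how the fraction is printed, not how it is stored.
op <- options(digits.secs = 6)
print(parsed[1])
options(op)
```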

apply --> data.frame

2012 Aug 30

Is there a way for an apply-type function to return a data frame? The closest thing I can think of is

foo <- as.data.frame(sapply(...))
names(foo) <- c(....)

Is there a more "elegant" way? Thanks!
-- Sam Steingold
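One common idiom, sketched here with hypothetical column names (`id`, `square`), is to have the function return a one-row data frame and bind the pieces with `do.call(rbind, ...)`. This is only one possible approach, not necessarily the answer given in the thread.

```r
# Each call returns a one-row data frame with named columns.
rows <- lapply(1:3, function(i) {
  data.frame(id = i, square = i^2)
})

# Stack the pieces into a single data frame; column names are kept.
foo <- do.call(rbind, rows)
print(foo)
```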

cannot read iso639 table

2012 Sep 13

line 109 did not have 5 elements ... but it did! empty beginning of file ... but it's not! details: --8<---------------cut here---------------start------------->8--- get.language.ISO.table <- function () { socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt", open="r",encoding="utf-8"); data <-

a merge() problem

2012 Oct 07

I know it does not look very good - using the same column names to mean different things in different data frames, but here you go: --8<---------------cut here---------------start------------->8--- > x <- data.frame(a=c(1,2,3),b=c(4,5,6)) > y <- data.frame(b=c(1,2),a=c("a","b")) >

cannot turn some columns in a data frame into factors

2006 May 11

Hi, I have a data frame df and a list of names of columns that I want to turn into factors: df.names <- attr(df,"names") sapply(factors, function (name) { pos <- match(name,df.names) if (is.na(pos)) stop(paste(name,": no such column\n")) df[[pos]] <- factor(df[[pos]]) cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
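A likely reason the loop above has no lasting effect is that the assignment to `df` inside the anonymous function modifies a local copy. A minimal sketch of an alternative, assuming `factors` is a character vector of column names (the sample data frame is hypothetical):

```r
# Hypothetical data frame and list of columns to convert.
df <- data.frame(a = c("x", "y", "x"),
                 b = c(1, 2, 3),
                 c = c("u", "u", "v"),
                 stringsAsFactors = FALSE)
factors <- c("a", "c")

# Fail early if any requested column is missing.
missing_cols <- setdiff(factors, names(df))
if (length(missing_cols) > 0) {
  stop("no such column(s): ", paste(missing_cols, collapse = ", "))
}

# Assign at the top level so the change sticks to df itself.
df[factors] <- lapply(df[factors], factor)
print(sapply(df, is.factor))
```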

matrix.csr %*% matrix --> matrix

2012 Aug 27

When a sparse matrix is multiplied by a regular one, the result is usually not sparse. However, when matrix.csr is multiplied by a regular matrix in R, a matrix.csr is produced. Is there a way to avoid this? Thanks!
-- Sam Steingold

where are these NAs coming from?

2012 Sep 19

I see this:

> length(which(is.na(z$language)))
[1] 0
> locals <- z[z$country == mycountry,]
> length(which(is.na(locals$language)))
[1] 229

Where are those locals without the language coming from?!
-- Sam Steingold
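One possible explanation for the session above, assuming `z$country` itself contains `NA`s: a logical row index with `NA` entries yields all-`NA` rows. The small data frame below is hypothetical and only reproduces that effect.

```r
# Hypothetical data: language is complete, country has one NA.
z <- data.frame(country  = c("US", NA, "FR"),
                language = c("en", "fr", "fr"),
                stringsAsFactors = FALSE)
mycountry <- "US"

# The comparison gives TRUE, NA, FALSE; the NA index becomes an all-NA row.
locals_na <- z[z$country == mycountry, ]
n_na <- sum(is.na(locals_na$language))

# which() drops the NA entries from the index, so no phantom rows appear.
locals_ok <- z[which(z$country == mycountry), ]
```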

qqnorm & huge datasets

2011 Dec 21

Hi, When I run qqnorm on a vector of length 10M+ I get a huge pdf file which cannot be loaded by acroread or evince. Any suggestions? (apart from sampling the data). Thanks.
-- Sam Steingold

removing NA from a data frame

2006 Mar 17

Hi, It appears that deal does not support missing values (NA), so I need to remove them (NAs) from my data frame. how do I do this? (I am very new to R, so a detailed step-by-step explanation with code samples would be nice). Some columns (variables) have quite a few NAs, so I would rather drop the whole column than sacrifice all the rows (observations) which have NA in that column. How do I
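A minimal sketch of the two-step cleanup described above: drop columns that are mostly `NA`, then drop the rows that still contain an `NA`. The sample data and the 50% threshold (`max_na_share`) are assumptions for illustration, not values from the thread.

```r
# Hypothetical data frame with NAs scattered across columns.
df <- data.frame(a = c(1, 2, NA, 4),
                 b = c(NA, NA, NA, 1),
                 c = c("x", "y", "z", NA),
                 stringsAsFactors = FALSE)

# Assumed threshold: drop a column if more than half its values are NA.
max_na_share <- 0.5
na_share <- colMeans(is.na(df))
df_cols <- df[, na_share <= max_na_share, drop = FALSE]

# Remove the remaining rows that contain any NA.
df_clean <- na.omit(df_cols)
print(df_clean)
```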

plot with a regression line(s)

2012 Apr 04

I am sure a common need is to plot a scatterplot with some fitted line(s) and maybe save to a file. I have this: plot.glm <- function (x, y, file = NULL, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), main = NULL) { m <- glm(y ~ x) if (!is.null(file)) pdf(file = file) plot(x, y, xlab = xlab, ylab = ylab, main = main) lines(x, y =

all.equal: subscript out of bounds

2011 Feb 15

When I do

> all(all$X.Time == all$Y.Time);
[1] TRUE

as expected, but

> all.equal(all$X.Time,all$Y.Time);
Error in target[[i]] : subscript out of bounds

Why? Thanks!
-- Sam Steingold

drop rare factors

2012 Jan 18

I have a data frame with some factor columns. I want to drop the rows with rare factor values (and remove the factor values from the factors). E.g., frame$MyFactor takes values A 1,000 times, B 2,000 times, C 30 times and D 4 times. I want to remove all rows which assume rare values (<1%), i.e., C and D. i.e., frame <- frame[[! (frame$MyFactor %in% c("A","B"))]] except
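A sketch of one way to do this without hard-coding the level names, using the counts from the question (A 1,000, B 2,000, C 30, D 4) and the 1% cutoff. The variable names `freq` and `keep` are illustrative.

```r
# Rebuild the example: A x1000, B x2000, C x30, D x4.
frame <- data.frame(
  MyFactor = factor(rep(c("A", "B", "C", "D"), times = c(1000, 2000, 30, 4)))
)

# Relative frequency of each level.
freq <- prop.table(table(frame$MyFactor))

# Keep only levels that make up at least 1% of the rows.
keep <- names(freq)[freq >= 0.01]
frame <- frame[frame$MyFactor %in% keep, , drop = FALSE]

# Remove the now-unused levels from the factor itself.
frame$MyFactor <- droplevels(frame$MyFactor)
print(table(frame$MyFactor))
```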

[[]] confusion

2011 Feb 15

what does the output for [[]] mean here: > all$X.Time[5] [1] "2011-02-15 09:32:26.37222" > all$X.Time[[5]] [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > all$X.Time[1] [1] "2011-02-15 09:31:29.18761" > all$X.Time[[1]] [1] 29.18761 34.30949 36.38144 12.28500 26.37222 47.00837 40.20271 32.83765 [9] 54.56998 28.56961 55.96641 28.91920 32.29962 10.94081 34.31731

sum(hist$density) == 2 ?!

2012 Mar 14

> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2

Shouldn't it be 1?!

> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1

Now it's 1. Why?!
-- Sam Steingold
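The density values are heights, so it is the area (height times bin width) that sums to 1. With a bin width of 0.5 the heights add up to 2; with unit-width bins they add up to 1. A short check, with an arbitrary seed chosen for reproducibility:

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(1000)
h <- hist(x, plot = FALSE)

# Bin widths come from the break points.
widths <- diff(h$breaks)

# Area under the histogram: sum of height * width.
area <- sum(h$density * widths)
print(area)
```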

extract fixed width fields from a string

2012 Jan 20

Hi, I have a data frame with one column containing string of the form "ABC...|XYZ..." where ABC etc are fields of 6 alphanumeric characters each and XYZ etc are fields of 8 alphanumeric characters each; "|" is a mandatory separator; I do not know in advance how many fields of each kind will each row contain. I need to extract these fields from the string. === How do I do that?
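A minimal sketch under the stated layout: split once on the literal `|`, then cut each half into fixed-width pieces with `substring`. The helper `split_fixed` and the sample string are hypothetical.

```r
# Hypothetical helper: cut a string into consecutive pieces of a given width.
split_fixed <- function(s, width) {
  if (nchar(s) == 0) return(character(0))
  starts <- seq(1, nchar(s), by = width)
  substring(s, starts, starts + width - 1)
}

# Hypothetical input: two 6-character fields, "|", two 8-character fields.
s <- "ABCDEFGHIJKL|12345678abcdefgh"

# fixed = TRUE treats "|" as a literal character, not a regex alternation.
parts <- strsplit(s, "|", fixed = TRUE)[[1]]
six   <- split_fixed(parts[1], 6)
eight <- split_fixed(parts[2], 8)
```

For a whole data frame column, the same steps could be wrapped in `lapply` over the column, since the number of fields varies per row.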

iptables rules

2010 Mar 29

I've got a server with several ip's on eth0. I want to block all traffic *except* to port 80 on them, but not on any other IPs, so that eth0 is www.xxx.yyy.zzz eth0:1 is www.xxx.yyy.ggg eth0:2 is www.xxx.yyy.hhh I've tried -A RH-Firewall-1-INPUT -p tcp -d www.xxx.yyy.ggg --dport ! 80 -j DROP -A RH-Firewall-1-INPUT -p tcp -d www.xxx.yyy.hhh --dport ! 80 -j DROP and restarted (and

drop zero slots from table?

2012 Sep 19

I find myself doing

tab <- table(...)
tab <- tab[tab > 0]
tab <- sort(tab,decreasing=TRUE)

all the time. I am wondering if the "drop 0" (and maybe even sort?) can be effected by some magic argument to table() which I fail to discover

not suppressing leading zeros when reading a table?

2005 Jul 10

Dear R list, I have a dataset with a column which should be read as character, like this:

  name surname answer
1 xx   yyy     "00100"
2 rrr  hhh     "01"

When reading this dataset with read.table, I get

1 xx   yyy     100
2 rrr  hhh     1

The string column consists of answers to multiple choice questions, not all having the same number of answers. I could format the
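One way to keep the leading zeros is to tell `read.table` not to convert the columns, via `colClasses`. The sketch below reads a small inline sample (a simplified stand-in for the file, without the row-number column) so that it is self-contained.

```r
# Simplified inline stand-in for the file described above.
txt <- 'name surname answer
xx yyy "00100"
rrr hhh "01"'

# colClasses = "character" is recycled to every column, so nothing is
# converted to numeric and the leading zeros survive.
d <- read.table(text = txt, header = TRUE, colClasses = "character")
print(d$answer)
```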

list to matrix?

2012 Dec 04

How do I convert a list to a matrix? --8<---------------cut here---------------start------------->8--- list(c(50000, 101), c(1e+05, 46), c(150000, 31), c(2e+05, 17), c(250000, 19), c(3e+05, 11), c(350000, 12), c(4e+05, 25), c(450000, 19), c(5e+05, 16)) as.matrix(a) [,1] [1,] Numeric,2 [2,] Numeric,2 [3,] Numeric,2 [4,] Numeric,2 [5,] Numeric,2 [6,] Numeric,2 [7,]
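A sketch of one common approach, assuming every list element is a numeric vector of the same length: bind the elements as rows with `do.call(rbind, ...)`. Only the first three elements of the list above are used here.

```r
# First three elements of the list from the question.
a <- list(c(50000, 101), c(1e+05, 46), c(150000, 31))

# rbind treats each vector as one row of the result.
m <- do.call(rbind, a)
print(m)
```

Using `cbind` instead would put each list element in its own column.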
