I get this error from read.table():

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 234 did not have 8 elements

The error is genuine (there is an extra field separator between the 1st and 2nd elements).

1. Is there a way to see this bad line 234 from R without diving into the file?

2. Is there a way to ignore the bad lines and get the data from the good lines only? (I do want to see the bad lines, but I don't want to stop all work until some issue that affects 1% of the data is resolved.)

Thanks.

Oh, yeah, a reproducible example:

read.csv from
====
a,b
1,2
3,4
5,,6
7,8
====
I want to be able to extract the data frame
  a b
1 1 2
2 3 4
3 7 8

and a list of strings of length 1 containing "5,,6".

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
On 24/01/2012 3:45 PM, Sam Steingold wrote:
> I get this error from read.table():
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>   line 234 did not have 8 elements
> The error is genuine (an extra field separator between 1st and 2nd element).
>
> 1. is there a way to see this bad line 234 from R without diving into the file?

You could use readLines(): skip 233 lines, then read one.

> 2. is there a way to ignore the bad lines and get the data from the good
> lines only (I do want to see the bad lines, but I don't want to stop all
> work until some issue which causes 1% of data is resolved).

I think you would have to read the first part up to line 233, then read the part after line 234, then use rbind() to join the two parts. The latter might be tricky if you need a header line; it may be easiest to rewrite the file to a tempfile().

Duncan Murdoch
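A minimal sketch of both suggestions, using the five-line example from the original post written to a temporary file so that it is self-contained. The names `path`, `bad_line`, `bad_text`, `clean_path` and `good` are illustrative and not from the thread; with the real file, `path` would be its name and `bad_line` would be 234. The sketch assumes `bad_line` is counted from the top of the file.

```r
# Self-contained setup: write the example from the question to a temp file.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4", "5,,6", "7,8"), path)

bad_line <- 4  # assumed file line number of the bad row (234 in the real case)

# 1. Look at the offending line: skip the lines before it, then read one.
con <- file(path, "r")
invisible(readLines(con, n = bad_line - 1))  # discard the preceding lines
bad_text <- readLines(con, n = 1)            # the line we want to inspect
close(con)
print(bad_text)

# 2. Rewrite the file without that line to a tempfile(), then read the copy.
#    The header stays in place, so no rbind() of two parts is needed.
all_lines <- readLines(path)
clean_path <- tempfile(fileext = ".csv")
writeLines(all_lines[-bad_line], clean_path)
good <- read.csv(clean_path)
print(good)
```

When a header is present, the line number in the scan() message may be counted from the first data line rather than from the top of the file, so it can be worth printing the neighbouring line as well.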
On 25/01/12 09:45, Sam Steingold wrote:
> [...]
> Oh, yeah, a reproducible example:
>
> read.csv from
> ====
> a,b
> 1,2
> 3,4
> 5,,6
> 7,8
> ====
> I want to be able to extract the data frame
>   a b
> 1 1 2
> 2 3 4
> 3 7 8
>
> and a list of strings of length 1 containing "5,,6".

Try:

    xxx <- readLines("<filename>")
    # Parse the header line on its own to get the column names and count.
    hhh <- read.csv(text = xxx[1], header = FALSE, stringsAsFactors = FALSE)
    good <- list()
    bad <- list()
    for(i in seq_along(xxx)[-1]) {
        tmp <- read.csv(text = xxx[i], header = FALSE)
        if(ncol(tmp) == ncol(hhh)) {
            good[[length(good) + 1]] <- tmp
        } else {
            bad[[length(bad) + 1]] <- xxx[i]  # keep the raw text of the bad line
        }
    }
    # Bind the good rows once, then attach the header names.
    yyy <- do.call(rbind, good)
    names(yyy) <- unlist(hhh[1, ], use.names = FALSE)

Here yyy is the data frame built from the good lines and bad is a list of the rejected lines as strings. Passing text = to read.csv() means no text connections are left open, so no closeAllConnections() call is needed.

HTH

cheers,

Rolf Turner
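As an alternative to the line-by-line loop above (a sketch, not something proposed in the thread), the field count of every line can be checked in one pass with count.fields() and the good lines parsed in a single read.csv() call. The names `n_fields` and `ok` are illustrative, and the example input is hard-coded here so that the block is self-contained; with a real file, `xxx` would come from readLines().

```r
# Example input from the question; in practice: xxx <- readLines("<filename>")
xxx <- c("a,b", "1,2", "3,4", "5,,6", "7,8")

# Count the comma-separated fields on every line. Blank lines are kept so
# that the counts stay aligned with the elements of xxx.
con <- textConnection(xxx)
n_fields <- count.fields(con, sep = ",", quote = "\"",
                         blank.lines.skip = FALSE)
close(con)

# The header line defines the expected number of fields.
ok <- n_fields == n_fields[1]
ok[is.na(ok)] <- FALSE  # treat lines with an NA count as bad

yyy <- read.csv(text = xxx[ok])  # header plus all good lines, parsed together
bad <- as.list(xxx[!ok])         # raw text of the rejected lines

print(yyy)
print(bad)
```

Parsing the good lines together lets read.csv() infer each column's type from the whole column instead of from one row at a time, and it avoids growing a data frame inside a loop.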