Anthony Damico
2012-Sep-14 16:30 UTC
[R] Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi everyone, my apologies if I'm overlooking something obvious in the documentation. I'm relatively inexperienced with the (awesome) ff package.

My goal is to use the read.table.ffdf() function to call the read.fwf() function and pass through the colClasses and comment.char arguments. The code below shows exactly what doesn't work for me. If the colClasses and comment.char parameters cannot be passed to read.fwf() through read.table.ffdf(), I'd love to hear any ideas for a workaround? :)

Passing the comment.char parameter isn't critical for what I'm doing, so long as read.table.ffdf() is in fact passing comment.char = '' through to read.fwf(), as stated in (5) of the Details section of the documentation<http://www.inside-r.org/packages/cran/ff/docs/read.table.ffdf>. I do not want comment.char = "#" -- the default for read.table().

I'm using R 2.15.1 x86_64-pc-mingw32/x64 (64-bit) on Windows 7.

Thanks!!

Anthony Damico
Kaiser Family Foundation


library(ff)

# create a simple temporary file..
fwffile <- tempfile()

# steal some example code from the documentation --
cat(file=fwffile, "123456", "987654", sep="\n")
x <- read.fwf(fwffile, widths=c(1,2,3))    #> 1 23 456 \ 9 87 654
y <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3))
stopifnot(identical(x, y[,]))

# the above block of code obviously works

# then if i want to add the colClasses parameter,
# read.fwf() works on its own
u <- read.fwf(fwffile, widths=c(1,2,3), colClasses = 'factor')    #> 1 23 456 \ 9 87 654

# but read.table.ffdf() does not.
v <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3), colClasses = 'factor')

# i'm confused why the line above doesn't work, since read.csv.ffdf()
# has no problem passing colClasses through to read.csv..
# as seen in this example (a slightly modified version of the documentation)
x <- data.frame(
    log=rep(c(FALSE, TRUE), length.out=26),
    int=1:26,
    dbl=1:26 + 0.1,
    fac=factor(letters),
    ord=ordered(LETTERS),
    dct=Sys.time()+1:26,
    dat=seq(as.Date("1910/1/1"), length.out=26, by=1)
)
x <- x[c(13:1, 13:1),]
csvfile <- tempPathFile(path=getOption("fftempdir"), extension="csv")
write.csv(x, file=csvfile, row.names=FALSE)

y <- read.csv(file=csvfile, header=TRUE, colClasses=c(dct="POSIXct", dat="Date"))    # "ordered" gives an error
ffx <- read.csv.ffdf(file=csvfile, header=TRUE, colClasses=c(dct="POSIXct", dat="Date"))

identical( y , ffx[ , ] )
Jan
2012-Sep-17 18:58 UTC
[R] Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi Anthony,

You are right: when FUN is read.fwf, read.table.ffdf does not handle the additional arguments that read.fwf passes on to read.table the way you expect. read.table.ffdf checks the arguments of read.fwf, and colClasses is not one of them; colClasses is part of the ... that read.fwf passes on to read.table. You should report this to the package author.

One way to work around it is the following code, after which read.fwf will work as you expect with the arguments comment.char and colClasses.

read.fwf.default <- get("read.fwf")

read.fwf <- function(file, widths, header = FALSE, sep = "\t", skip = 0,
                     row.names, col.names, n = -1, buffersize = 2000,
                     comment.char = "#", colClasses = NA, ...){
    read.fwf.default(file=file, widths=widths, header=header, sep=sep,
                     skip=skip, row.names=row.names, col.names=col.names,
                     n=n, buffersize=buffersize,
                     comment.char = comment.char, colClasses = colClasses, ...)
}

v <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3), colClasses = 'factor')

hope this helps,
Jan
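
The explanation in the reply above can be checked in base R without ff. The sketch below only inspects formal argument lists and then calls a small wrapper directly; it makes no claim about what read.table.ffdf does internally beyond what the reply says. The name read.fwf.wrapped and the default comment.char = "" are assumptions made for illustration (the default matches the original question, not the reply's code).

```r
# colClasses is a formal argument of read.table but not of read.fwf;
# read.fwf only forwards it through `...`.
"colClasses" %in% names(formals(utils::read.fwf))     # FALSE
"colClasses" %in% names(formals(utils::read.table))   # TRUE

# Hypothetical wrapper that names colClasses and comment.char explicitly,
# so that code inspecting the formals of the reader function can see them.
read.fwf.wrapped <- function(file, widths, colClasses = NA,
                             comment.char = "", ...) {
    utils::read.fwf(file = file, widths = widths,
                    colClasses = colClasses,
                    comment.char = comment.char, ...)
}
"colClasses" %in% names(formals(read.fwf.wrapped))    # TRUE

# Same two-line fixed-width file as in the original question.
fwffile <- tempfile()
cat(file = fwffile, "123456", "987654", sep = "\n")

# Called directly, the wrapper returns three factor columns.
u <- read.fwf.wrapped(fwffile, widths = c(1, 2, 3), colClasses = "factor")
all_factors <- all(vapply(u, is.factor, logical(1)))
```

If the wrapper is then handed to read.table.ffdf via FUN, as in the reply, whether colClasses reaches read.table should still be verified against the installed version of ff.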