Anthony Damico
2012-Sep-14 16:30 UTC
[R] Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi everyone, my apologies if I'm overlooking something obvious in the
documentation. I'm relatively inexperienced with the (awesome) ff package.
My goal is to use the read.table.ffdf() function to call the read.fwf()
function and pass through the colClasses and comment.char arguments. The
code below shows exactly what doesn't work for me.
If the colClasses and comment.char parameters cannot be passed to
read.fwf() through read.table.ffdf(), I'd love to hear any ideas for a
workaround? :)
Passing the comment.char parameter isn't critical for what I'm doing, so
long as read.table.ffdf() is in fact passing comment.char = '' through to
read.fwf(), as stated in (5) of the Details section of the documentation
<http://www.inside-r.org/packages/cran/ff/docs/read.table.ffdf>.
I do not want comment.char = "#" -- the default for read.table().
I'm using R 2.15.1 x86_64-pc-mingw32/x64 (64-bit) on Windows 7.
Thanks!!
Anthony Damico
Kaiser Family Foundation
library(ff)
# create a simple temporary file..
fwffile <- tempfile()
# steal some example code from the documentation --
cat(file=fwffile, "123456", "987654", sep="\n")
x <- read.fwf(fwffile, widths=c(1,2,3)) #> 1 23 456 \ 9 87 654
y <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3))
stopifnot(identical(x, y[,]))
# the above block of code obviously works
# then if i want to add the colClasses parameter,
# read.fwf() works on its own
u <- read.fwf(fwffile, widths=c(1,2,3), colClasses = 'factor')
#> 1 23 456 \ 9 87 654
# but read.table.ffdf() does not.
v <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3),
                     colClasses = 'factor')
# i'm confused why the line above doesn't work, since read.csv.ffdf() has
# no problem passing colClasses through to read.csv..
# as seen in this example (a slightly modified version of the documentation)
x <- data.frame(log=rep(c(FALSE, TRUE), length.out=26), int=1:26,
                dbl=1:26 + 0.1, fac=factor(letters), ord=ordered(LETTERS),
                dct=Sys.time()+1:26,
                dat=seq(as.Date("1910/1/1"), length.out=26, by=1))
x <- x[c(13:1, 13:1),]
csvfile <- tempPathFile(path=getOption("fftempdir"),
extension="csv")
write.csv(x, file=csvfile, row.names=FALSE)
y <- read.csv(file=csvfile, header=TRUE,
colClasses=c(dct="POSIXct",
dat="Date")) # "ordered" gives an error
ffx <- read.csv.ffdf(file=csvfile, header=TRUE,
colClasses=c(dct="POSIXct",
dat="Date"))
identical( y , ffx[ , ] )
Jan
2012-Sep-17 18:58 UTC
[R] Any way to get read.table.ffdf() (in the ff package) to pass colClasses or comment.char parameters through to read.fwf() ?
Hi Anthony,
You are right: read.table.ffdf() does not pass additional arguments such as
colClasses on to read.fwf() the way you expect. read.table.ffdf() checks the
formal arguments of read.fwf(), and colClasses is not one of them. In
read.fwf(), colClasses is part of ..., which is passed on to read.table().
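
The point about formal arguments can be checked directly in base R. The
following is a minimal sketch that only inspects the signatures of read.fwf()
and read.table(). It does not reproduce how read.table.ffdf() performs its own
check internally, which is assumed here to work along the lines described
above.

```r
# colClasses and comment.char are formal arguments of read.table(),
# but read.fwf() only accepts them through `...`
fwf_args   <- names(formals(utils::read.fwf))
table_args <- names(formals(utils::read.table))

c("colClasses", "comment.char") %in% fwf_args    # both FALSE
c("colClasses", "comment.char") %in% table_args  # both TRUE
"..." %in% fwf_args                              # TRUE
```
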
You should report this to the package author. One way to work around it is to
run the following code, after which read.fwf() will work through
read.table.ffdf() as you expect, with the arguments comment.char and colClasses.
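# keep a reference to the original read.fwf (run this line only once,
# otherwise read.fwf.default would point at the wrapper itself)
read.fwf.default <- get("read.fwf")
# wrapper that exposes comment.char and colClasses as formal arguments
read.fwf <- function(file, widths, header = FALSE, sep = "\t", skip = 0,
                     row.names, col.names, n = -1, buffersize = 2000,
                     comment.char = "#", colClasses = NA, ...){
  read.fwf.default(file=file, widths=widths, header=header, sep=sep,
                   skip=skip, row.names=row.names, col.names=col.names, n=n,
                   buffersize=buffersize, comment.char = comment.char,
                   colClasses = colClasses, ...)
}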
v <- read.table.ffdf(file=fwffile, FUN="read.fwf", widths=c(1,2,3),
colClasses = 'factor')
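
A wrapper of this kind can be checked on its own, without ff. The sketch below
makes some assumptions: the name read.fwf.cc and the comment.char = "" default
are illustrative choices (the latter following the preference stated in the
original question), not part of the code above, and whether read.table.ffdf()
accepts such a function through FUN has not been verified here.

```r
# Hypothetical wrapper: exposes colClasses and comment.char as formal
# arguments and forwards everything else unchanged to utils::read.fwf()
read.fwf.cc <- function(file, widths, ..., comment.char = "", colClasses = NA) {
  utils::read.fwf(file = file, widths = widths, ...,
                  comment.char = comment.char, colClasses = colClasses)
}

# small fixed-width file whose second line contains a "#"
tmp <- tempfile()
cat(file = tmp, "123456", "98#654", sep = "\n")

chk <- read.fwf.cc(tmp, widths = c(1, 2, 3), colClasses = "factor")
col_classes <- sapply(chk, class)        # every column is read as a factor
second_field <- as.character(chk[2, 2])  # "#" is kept as data, not a comment
unlink(tmp)
```
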
hope this helps,
Jan