Romain François
2011-Nov-21 09:59 UTC
[Rd] extending the colClasses argument in read.table
Hello,

We've released the int64 package to CRAN a few days ago. The package provides S4 classes "int64" and "uint64" that represent signed and unsigned 64 bit integer vectors.

One further development of the package is to facilitate reading 64 bit integer data from csv, etc ... files.

I have this function that wraps a call to read.csv to:
- read the "int64" and "uint64" columns as "character"
- convert them afterwards to the appropriate type

    read.csv.int64 <- function (file, ...){
        dots <- list( file, ... )
        if( "colClasses" %in% names(dots) ){
            colClasses <- dots[["colClasses"]]
            idx.int64  <- colClasses == "int64"
            idx.uint64 <- colClasses == "uint64"
            colClasses[ idx.int64 | idx.uint64 ] <- "character"
            dots[["colClasses" ]] <- colClasses
            df <- do.call( "read.csv", dots )
            if( any( idx.int64 ) ){
                df[ idx.int64 ] <- lapply( df[ idx.int64 ], as.int64 )
            }
            if( any( idx.uint64 ) ){
                df[ idx.uint64 ] <- lapply( df[ idx.uint64 ], as.uint64 )
            }
            df
        } else {
            read.csv( file, ... )
        }
    }

I was wondering if it would make sense to extend the colClasses argument so that other packages can provide drivers, so that we could let the users just use read.csv, read.table, etc ...

Before I start digging into the internals of read.table, I wanted to have opinions about whether this would be a good idea, etc ...

Best Regards,

Romain

--
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr
Gabor Grothendieck
2011-Nov-21 15:31 UTC
[Rd] extending the colClasses argument in read.table
2011/11/21 Romain François <romain at r-enthusiasts.com>:
> Hello,
>
> We've released the int64 package to CRAN a few days ago. The package
> provides S4 classes "int64" and "uint64" that represent signed and unsigned
> 64 bit integer vectors.
>
> One further development of the package is to facilitate reading 64 bit
> integer data from csv, etc ... files.
>
> I have this function that wraps a call to read.csv to:
> - read the "int64" and "uint64" columns as "character"
> - converts them afterwards to the appropriate type
>

Try this:

    > library(int64)
    > Lines <- "A\n12\n"
    >
    > setAs("character", "int64", function(from) as.int64(from))
    >
    > DF <- read.csv(textConnection(Lines), colClasses = "int64")
    >
    > str(DF)
    'data.frame':   1 obs. of  1 variable:
     $ A:Formal class 'int64' [package "int64"] with 2 slots
      .. ..@ .Data:List of 1
      .. .. ..$ : int  0 12
      .. ..@ NAMES: NULL

To convince ourselves that it's translating from character to int64:

    > setAs("character", "int64", function(from) {print(class(from)); as.int64(from)})
    > DF <- read.csv(textConnection(Lines), colClasses = "int64")
    [1] "character"

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
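
The same mechanism can be tried without the int64 package installed. read.table reads any column whose colClasses entry is not one of the built-in types as character and then converts it with methods::as(), so a coercion method registered with setAs() is enough. The following is only a minimal sketch: the class name "percent" and the sample data are made up for illustration and are not part of the thread or of any package.

```r
library(methods)

# Hypothetical class name, used only as a label for colClasses.
# It is declared without a representation (a virtual class).
setClass("percent")

# Coercion method that read.table reaches through methods::as()
# for colClasses entries it does not recognise as built-in types.
setAs("character", "percent", function(from) {
    as.numeric(sub("%$", "", from)) / 100
})

# Made-up sample data: the first column carries a trailing "%".
Lines <- "rate,label\n12%,a\n7.5%,b\n"

DF <- read.csv(text = Lines, colClasses = c("percent", "character"))
str(DF)
```

Here the conversion function returns a plain numeric vector, so the "percent" class only acts as a hook. A package such as int64 would instead return an object of its own S4 class, as in the session above.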