Jens Oehlschlägel
2009-Nov-02 13:15 UTC
[Rd] two small wishes (with code sugegstions) for R-core
Dear R developers,

It would be great if you could implement the two minor code changes suggested below, which would help with processing large objects in R.

Jens Oehlschlägel


# Wish no. 1: let [.AsIs return the class AFTER subsetting, not the class of the original object
# Wish no. 2: adjust write.csv and write.csv2 for multiple calls in chunked writing

# Rationale no. 1: a couple of packages will return a different class than SomeClass when subsetting with [.SomeClass
#   and still need to keep the AsIs property
# Examples of classes returning different classes on subscripting are in packages 'bit', 'ff', 'bigmemory'
# For classes where [.SomeClass will return class SomeClass, such a change will not hurt

# Code suggestion no. 1: please use
"[.AsIs" <- function (x, i, ...){
    ret <- NextMethod("[")
    oldClass(ret) <- c("AsIs", oldClass(ret))
    ret
}
# instead of
"[.AsIs" <- function (x, i, ...)
    structure(NextMethod("["), class = class(x))


# Rationale no. 2: write.csv and write.csv2 currently enforce that a header must be written, even with append=TRUE
# This prevents a csv file from being written in chunks.
# If argument append=TRUE is used, a header should not be enforced (maybe even be forbidden)

# Code suggestion no. 2: please use
write.csv <- function (...)
{
    Call <- match.call(write.table, expand.dots = TRUE)
    for (argname in c("col.names", "sep", "dec", "qmethod"))
        if (!is.null(Call[[argname]]))
            warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
    rn <- eval.parent(Call$row.names)
    ap <- eval.parent(Call$append)
    Call$col.names <- if (is.logical(ap) && ap) FALSE else {if (is.logical(rn) && !rn) TRUE else NA}
    Call$sep <- ","
    Call$dec <- "."
    Call$qmethod <- "double"
    Call[[1L]] <- as.name("write.table")
    eval.parent(Call)
}
write.csv2 <- function (...)
{
    Call <- match.call(write.table, expand.dots = TRUE)
    for (argname in c("col.names", "sep", "dec", "qmethod"))
        if (!is.null(Call[[argname]]))
            warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
    rn <- eval.parent(Call$row.names)
    ap <- eval.parent(Call$append)
    Call$col.names <- if (is.logical(ap) && ap) FALSE else {if (is.logical(rn) && !rn) TRUE else NA}
    Call$sep <- ";"
    Call$dec <- ","
    Call$qmethod <- "double"
    Call[[1L]] <- as.name("write.table")
    eval.parent(Call)
}
# instead of
write.csv <- function (...)
{
    Call <- match.call(expand.dots = TRUE)
    for (argname in c("col.names", "sep", "dec", "qmethod"))
        if (!is.null(Call[[argname]]))
            warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
    rn <- eval.parent(Call$row.names)
    Call$col.names <- if (is.logical(rn) && !rn) TRUE else NA
    Call$sep <- ","
    Call$dec <- "."
    Call$qmethod <- "double"
    Call[[1L]] <- as.name("write.table")
    eval.parent(Call)
}
write.csv2 <- function (...)
{
    Call <- match.call(expand.dots = TRUE)
    for (argname in c("col.names", "sep", "dec", "qmethod"))
        if (!is.null(Call[[argname]]))
            warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
    rn <- eval.parent(Call$row.names)
    Call$col.names <- if (is.logical(rn) && !rn) TRUE else NA
    Call$sep <- ";"
    Call$dec <- ","
    Call$qmethod <- "double"
    Call[[1L]] <- as.name("write.table")
    eval.parent(Call)
}
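The difference between the two class-handling strategies in wish no. 1 can be tried out without touching base R. The sketch below is only an illustration under stated assumptions: the classes markOld, markNew, chunky, chunk and keeper are hypothetical stand-ins invented here (they are not AsIs and not taken from 'bit', 'ff' or 'bigmemory'), and the two marker methods simply copy the bodies of the two "[.AsIs" variants quoted above.

```r
# Hypothetical class whose "[" method returns a DIFFERENT class ("chunk").
"[.chunky" <- function(x, i, ...)
    structure(unclass(x)[i], class = "chunk")

# Hypothetical class whose "[" method copies the full class vector of x.
"[.keeper" <- function(x, i, ...)
    structure(unclass(x)[i], class = oldClass(x))

# Marker following the "instead of" variant: restore the original class vector.
"[.markOld" <- function(x, i, ...)
    structure(NextMethod("["), class = class(x))

# Marker following the suggested variant: prepend the marker to whatever
# class the next method returned.
"[.markNew" <- function(x, i, ...) {
    ret <- NextMethod("[")
    oldClass(ret) <- c("markNew", oldClass(ret))
    ret
}

x_old <- structure(1:10, class = c("markOld", "chunky"))
x_new <- structure(1:10, class = c("markNew", "chunky"))
x_dup <- structure(1:10, class = c("markNew", "keeper"))

class(x_old[2:3])  # "markOld" "chunky": the subset's own class "chunk" is lost
class(x_new[2:3])  # "markNew" "chunk":  the subset's class survives
class(x_dup[2:3])  # "markNew" "markNew" "keeper": marker is duplicated
```

The last line shows a side effect of plain prepending: when the next method already carries the marker over, the marker ends up in the class vector twice, so a production version would presumably need to guard against that.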
Prof Brian Ripley
2009-Nov-18 09:02 UTC
[Rd] two small wishes (with code sugegstions) for R-core
I've incorporated a more correct version of 1: think about what happens if the next method also copies over the class, for example. (It was more complicated than that: [.AsIs was used for other classes in R -- did you actually test your suggestion, as 'make check' failed for me?)

But write.csv[2] are just wrappers, and if you want to do something complicated simply don't use them. They are intended to help naive users of .csv files get them right, and banning append=TRUE seems the best way to help do that. (A recent R-help thread shows that non-users of Excel are often unaware of its exact requirements -- and that includes me.)

The wishes may be small: the work to implement and test them is often not.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
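For readers who want chunked CSV output without any change to the wrappers, the advice to bypass them can be followed by calling write.table directly. The following is a minimal sketch, not a tested recipe for Excel compatibility: the data frame, the chunk size and the temporary file are made up for illustration, and the sep, dec and qmethod values are the ones write.csv sets in the code quoted in this thread.

```r
# Hypothetical data and chunk size, for illustration only.
df <- data.frame(id = 1:10, value = letters[1:10])
chunk_size <- 4L
path <- tempfile(fileext = ".csv")

starts <- seq(1L, nrow(df), by = chunk_size)
for (s in starts) {
    rows  <- s:min(s + chunk_size - 1L, nrow(df))
    first <- s == 1L
    write.table(df[rows, , drop = FALSE], file = path,
                sep = ",", dec = ".", qmethod = "double",
                row.names = FALSE,
                col.names = first,    # header only with the first chunk
                append    = !first)   # later chunks are appended
}

# Read the file back to check that the chunks form one table.
back <- read.csv(path)
nrow(back)  # 10
```

Writing the header only with the first chunk and appending afterwards is the behaviour wish no. 2 asked of write.csv; doing it through write.table keeps the wrappers restrictive for casual use, at the cost of spelling out the CSV settings by hand.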