martin at martincmartin.com
2008-Aug-20 20:45 UTC
[Rd] Improvements to write.arff (PR#12574)
Full_Name: Martin C. Martin Version: 2.7.1 OS: Ubuntu Submission from: (NULL) (75.150.115.86) The function write.arff, in the foreign library: - Can produce relation names with invalid characters - Doesn't use colnames() for attribute names when writing a matrix. Here's a better version: write.arff <- function (x, file, eol = "\n") { if (file == "") file <- stdout() else if (is.character(file)) { file <- file(file, "w") on.exit(close(file)) } if (!inherits(file, "connection")) stop("Argument 'file' must be a character string or connection.") if (!is.data.frame(x) && !is.matrix(x)) x <- data.frame(x) squote <- function(s) { ifelse(is.na(s), s, sprintf("'%s'", gsub("(['\\])", "\\\\\\1", s))) } text <- paste('@relation "', make.names(deparse(substitute(x))), '"', sep='') writeLines(text, file, sep = eol) for (name in colnames(x)) { text <- paste("@attribute", name) if (is.data.frame(x) && is.factor(x[[name]])) { lev <- squote(levels(x[[name]])) levels(x[[name]]) <- lev text <- paste(text, " {", paste(lev, collapse = ","), "}", sep = "") } else if (is.character(x[,name])) { text <- paste(text, "string") x[,name] <- squote((x[,name])) } else if (inherits(x[,name], "POSIXt")) { text <- paste(text, "date \"yyyy-MM-dd hh:mm:ss\"") x[,name] <- squote(format(x[,name])) } else { text <- paste(text, "numeric") } writeLines(text, file, sep = eol) } writeLines("@data", file) write.table(x, file = file, na = "?", sep = ",", eol = eol, quote = FALSE, row.names = FALSE, col.names = FALSE) }
Thank you, but please see the R FAQ and report facts. We need examples of what goes wrong, and we will also need to document the fixes (e.g. names get mangled). We do prefer patches against the actual sources (lots of comments have been missed in your version): the current source tarball for foreign is on CRAN. On Wed, 20 Aug 2008, martin at martincmartin.com wrote:> Full_Name: Martin C. Martin > Version: 2.7.1 > OS: Ubuntu > Submission from: (NULL) (75.150.115.86) > > > The function write.arff, in the foreign library: > > - Can produce relation names with invalid characters > - Doesn't use colnames() for attribute names when writing a matrix. > > Here's a better version: > > write.arff <- function (x, file, eol = "\n") > { > if (file == "") > file <- stdout() > else if (is.character(file)) { > file <- file(file, "w") > on.exit(close(file)) > } > if (!inherits(file, "connection")) > stop("Argument 'file' must be a character string or connection.") > if (!is.data.frame(x) && !is.matrix(x)) > x <- data.frame(x) > squote <- function(s) { > ifelse(is.na(s), s, sprintf("'%s'", gsub("(['\\])", "\\\\\\1", > s))) > } > text <- paste('@relation "', make.names(deparse(substitute(x))), '"', > sep='') > writeLines(text, file, sep = eol) > for (name in colnames(x)) { > text <- paste("@attribute", name) > if (is.data.frame(x) && is.factor(x[[name]])) { > lev <- squote(levels(x[[name]])) > levels(x[[name]]) <- lev > text <- paste(text, " {", paste(lev, collapse = ","), > "}", sep = "") > } > else if (is.character(x[,name])) { > text <- paste(text, "string") > x[,name] <- squote((x[,name])) > } > else if (inherits(x[,name], "POSIXt")) { > text <- paste(text, "date \"yyyy-MM-dd hh:mm:ss\"") > x[,name] <- squote(format(x[,name])) > } > else { > text <- paste(text, "numeric") > } > writeLines(text, file, sep = eol) > } > writeLines("@data", file) > write.table(x, file = file, na = "?", sep = ",", eol = eol, > quote = FALSE, row.names = FALSE, col.names = FALSE) > } > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595