Michael Chirico
2019-Aug-02 07:29 UTC
[Rd] bug: write.dcf converts hyphen in field name to period
    write.dcf(list('my-field' = 1L), tmp <- tempfile())
    cat(readLines(tmp))
    # my.field: 1

However, there's nothing wrong with hyphenated fields per the Debian standard:

https://www.debian.org/doc/debian-policy/ch-controlfields.html

In fact, the standard itself uses hyphenated fields, and read.dcf handles them just fine:

    writeLines(gsub('.', '-', readLines(tmp), fixed = TRUE), tmp)
    read.dcf(tmp)
    #      my-field
    # [1,] "1"

The guilty line is the as.data.frame call:

    if(!is.data.frame(x)) x <- as.data.frame(x, stringsAsFactors = FALSE)

Simply adding check.names=FALSE to this call would solve the issue in my case, but I think not in general. Here's what I see in the standard:

> The field name is composed of US-ASCII characters excluding control
> characters, space, and colon (i.e., characters in the ranges U+0021 (!)
> through U+0039 (9), and U+003B (;) through U+007E (~), inclusive). Field
> names must not begin with the comment character (U+0023 #), nor with the
> hyphen character (U+002D -).

This could be handled by an adjustment to the next line:

    nmx <- names(x)

becomes

    nmx <- gsub('^[#-]', '', gsub('[^\U{0021}-\U{0039}\U{003B}-\U{007E}]', '.', names(x)))

(Or maybe it should raise an error for invalid names.)

Michael Chirico
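The name adjustment proposed above can be tried on its own, outside write.dcf. The following is only a sketch: sanitize_dcf_names is a hypothetical helper, not part of base R. It is also a slight variation on the one-liner in the post: it writes the character ranges literally, uses perl = TRUE so that the ranges are interpreted by code point, and strips every leading '#' or '-' rather than only the first.

```r
# Hypothetical helper (not in base R): coerce names to valid DCF field names
# following the Debian policy text quoted above.
sanitize_dcf_names <- function(nms) {
  # Replace anything outside U+0021..U+0039 ('!'..'9') and
  # U+003B..U+007E (';'..'~') with a period. This covers space,
  # colon, and control characters.
  nms <- gsub("[^!-9;-~]", ".", nms, perl = TRUE)
  # Field names must not begin with '#' or '-'; drop all such leading
  # characters (an assumption: the post's version drops only one).
  sub("^[#-]+", "", nms, perl = TRUE)
}

# Hyphens inside a name are valid and are left alone.
sanitize_dcf_names(c("my-field", "my field", "a:b", "-lead", "#note"))

# Separately, building the data frame with check.names = FALSE keeps the
# hyphen in its names before anything is passed to write.dcf.
df <- data.frame("my-field" = 1L, check.names = FALSE)
names(df)
```

Whether invalid names should be silently rewritten like this or rejected with an error is the open question raised at the end of the post; the sketch only illustrates the rewriting option.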
Apparently Analogous Threads
- write.dcf does not quote as Debian would like it to (PR#12816)
- write.dcf/read.dcf cycle converts missing entry to "NA" (PR#9796)
- (PR#9796) write.dcf/read.dcf cycle converts missing entry
- Lack of final newline in write.dcf changes append usage
- Read.dcf with no newline ending: gzfile drops last line