There's an as.matrix() call in write.table that means the formatting of numeric columns changes depending on whether there are any non-numeric columns in the table or not. For example,

> x <- data.frame(a=1:10,b=1:10)
> write.table(x,sep=',',row.names=F)
"a","b"
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9
10,10

> x <- data.frame(a=1:10,b=as.factor(1:10))
> write.table(x,sep=',',row.names=F)
"a","b"
 1,"1"
 2,"2"
 3,"3"
 4,"4"
 5,"5"
 6,"6"
 7,"7"
 8,"8"
 9,"9"
10,"10"

Notice that column "a" has a leading space in the second example, but not the first.

Normally this won't matter, but RSQLite uses write.table in the sqliteWriteTable function, which would cause column "a" above to be treated as text rather than numeric, and be sorted as text rather than into numerical order.

Duncan Murdoch
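The effect can be seen without write.table at all, by calling as.matrix() directly on the two data frames from the report. This is only a small sketch; the variable names (x_num, x_mixed, m_num, m_mixed) are made up for illustration.

```r
# The same two data frames as in the report.
x_num   <- data.frame(a = 1:10, b = 1:10)
x_mixed <- data.frame(a = 1:10, b = as.factor(1:10))

m_num   <- as.matrix(x_num)    # all columns numeric: result is an integer matrix
m_mixed <- as.matrix(x_mixed)  # a factor is present: result is a character matrix

print(typeof(m_num))
print(typeof(m_mixed))

# In the mixed case column "a" has been run through format(),
# so its entries are padded to a common width.
print(unname(m_mixed[, "a"]))
```

The padding comes from format() working on the whole column at once, so the width depends on the widest value in that column.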
>>>>> "Duncan" == Duncan Murdoch <murdoch@stats.uwo.ca>
>>>>>     on Sat, 4 Dec 2004 01:55:26 +0100 (CET) writes:

    Duncan> There's an as.matrix() call in write.table that means the formatting
    Duncan> of numeric columns changes depending on whether there are any
    Duncan> non-numeric columns in the table or not.

yes, I think I had seen this (a while ago in the source code) and then wondered if one shouldn't have used
    data.matrix()  instead of  as.matrix()
- something I actually do advocate more generally, as "good programming style". It also does solve the problem in the example here -- HOWEVER, the lines *before* as.matrix() have

    ## as.matrix might turn integer or numeric columns into a complex matrix
    cmplx <- sapply(x, is.complex)
    if(any(cmplx) && !all(cmplx)) x[cmplx] <- lapply(x[cmplx], as.character)
    x <- as.matrix(x)

which makes you see that write.table(.) should also work when the data frame has complex variables {or some other kinds of non-numeric as you've said above} -- something which data.matrix() can't handle....

As soon as you have a complex or a character variable (together with others) in your data.frame, as.matrix() will have to return "character" and apply format() to the numeric variables, as well...

So, to make this consistent in your sense, i.e. formatting of a column shouldn't depend on the presence of other columns, we can use neither as.matrix() nor data.matrix() but have to basically replicate an altered version of as.matrix inside write.table.

I propose to do this, but expose the altered version as something like
    as.charMatrix(.)

and replace the 4 lines {of code in write.table()} above by the single line
    as.charMatrix(x)

Martin
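To make the proposal concrete, here is one way a column-by-column conversion could look. This is a hypothetical sketch, not the as.charMatrix() that was proposed or anything that exists in R: the name as_char_matrix_sketch is invented, it assumes a data frame with at least one column, and it picks as.character() (no padding) for every column purely for illustration.

```r
# Hypothetical helper: convert each column to character on its own,
# so the text produced for one column never depends on the others.
as_char_matrix_sketch <- function(x) {
  stopifnot(is.data.frame(x), length(x) > 0)
  cols <- lapply(x, as.character)  # factors give their labels, numbers are unpadded
  do.call(cbind, cols)             # column names are taken from the list names
}

x <- data.frame(a = 1:10, b = as.factor(1:10))
m <- as_char_matrix_sketch(x)
print(m[, "a"])

# A complex column next to an integer one: "a" is formatted the same way as before.
z <- data.frame(a = 1:2, z = complex(real = 1:2, imaginary = 0))
print(as_char_matrix_sketch(z))
```

Because each column is converted separately, adding or removing a factor, character or complex column leaves the text of column "a" unchanged, which is the consistency asked for in the thread.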
On Sat, 4 Dec 2004 13:51:55 +0100, Martin Maechler <maechler@stat.math.ethz.ch> wrote:

>>>>>> "Duncan" == Duncan Murdoch <murdoch@stats.uwo.ca>
>>>>>>     on Sat, 4 Dec 2004 01:55:26 +0100 (CET) writes:
>
>    Duncan> There's an as.matrix() call in write.table that means the formatting
>    Duncan> of numeric columns changes depending on whether there are any
>    Duncan> non-numeric columns in the table or not.
>
>yes, I think I had seen this (a while ago in the source code)
>and then wondered if one shouldn't have used
>    data.matrix()  instead of  as.matrix()
>- something I actually do advocate more generally, as "good
>programming style". It also does solve the problem in the
>example here -- HOWEVER, the lines *before* as.matrix() have
>
>    ## as.matrix might turn integer or numeric columns into a complex matrix
>    cmplx <- sapply(x, is.complex)
>    if(any(cmplx) && !all(cmplx)) x[cmplx] <- lapply(x[cmplx], as.character)
>    x <- as.matrix(x)
>
>which makes you see that write.table(.) should also work when
>the data frame has complex variables {or some other kinds of
>non-numeric as you've said above} -- something which
>data.matrix() can't handle....
>As soon as you have a complex or a character variable (together
>with others) in your data.frame, as.matrix() will have to
>return "character" and apply format() to the numeric variables, as well...
>
>So, to make this consistent in your sense, i.e. formatting of a
>column shouldn't depend on the presence of other columns, we
>can't use as.matrix() nor data.matrix() but have to basically
>replicate an altered version of as.matrix inside write.table.
>
>I propose to do this, but expose the altered version as
>something like
>    as.charMatrix(.)
>
>and replace the 4 lines {of code in write.table()} above by the
>single line
>    as.charMatrix(x)

That sounds good. Which version of the formatting would you choose, leading spaces or not?
My preference would be to leave off the leading spaces, in the belief that write.table is usually used for data storage rather than data display, but it is sometimes used for data display (e.g. in utils::upgrade.packageStatus, which would not be affected by your choice).

Duncan Murdoch
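The two formatting choices under discussion can be compared side by side on a single column. The sketch below uses format() for the padded variant and as.character() for the unpadded one; the variable names are made up, and nothing here is taken from the write.table source.

```r
a <- 1:10

padded   <- format(a)        # every entry padded to the width of the widest one
unpadded <- as.character(a)  # no padding

print(padded)
print(unpadded)

# R itself reads both variants back as the same integers ...
print(identical(as.integer(padded), as.integer(unpadded)))

# ... but the strings differ, which matters to a consumer that
# inspects the text rather than parsing it as a number.
print(identical(padded, unpadded))
```

The padded form lines up in a display, while the unpadded form is the safer one for storage, which matches the preference stated above.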