On Sat, Apr 21, 2012 at 3:28 PM, Max Kuhn <mxkuhn at gmail.com> wrote:
> For a package, I need to write a csv version of a data set to an R
> object. Right now, I use:
>
>    out <- capture.output(
>                          write.table(x,
>                                      sep = ",",
>                                      na = "?",
>                                      file = "",
>                                      quote = FALSE,
>                                      row.names = FALSE,
>                                      col.names = FALSE))
>
> To me, this is fairly slow; 131 seconds for a data frame with 8100
> rows and 1400 columns.
>
> The data will be in a data frame; I know write.table() would be faster
> with a matrix. I was looking into converting the data frame to a
> character matrix using as.matrix() or, better yet, format() prior to
> the call above. However, I'm not sure what an appropriate value of
> 'digits' should be so that the character version of numeric data has
> acceptable fidelity.
>
> I also tried using a text connection and sink() as shown in
> ?textConnection but there was no speedup.
>
You could try a loop over each row, and use 'paste' to join each
element in a row by commas. Then use 'paste' again to join everything
you've got (a vector of rows) by a '\n' character.
Something like:

  paste(apply(x, 1, paste, collapse = ","), collapse = "\n")  # untested

You probably also want to stick a final \n on it.
Is it faster? I don't know!
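
A slightly fuller sketch of the same paste idea, going column by column
instead of row by row. As far as I know, apply() on a data frame first
coerces it with as.matrix(), which for a mixed-type data frame runs the
numeric columns through format() (padded to a common width, default
digits), so this version converts each column with as.character()
instead. The helper name df_to_csv_lines and its 'na' argument are made
up for illustration, and the speed is still unmeasured:

```r
# Hypothetical helper: one comma-separated string per row of a data frame.
df_to_csv_lines <- function(x, na = "?") {
  # Convert each column to character on its own. as.character() on a
  # double keeps up to 15 significant digits and does no width padding.
  cols <- lapply(x, function(col) {
    out <- as.character(col)
    out[is.na(col)] <- na   # same role as na = "?" in write.table()
    out
  })
  # paste() is vectorised across its arguments: element i of every column
  # is joined with commas. unname() keeps column names such as "sep" from
  # being matched as paste() arguments.
  do.call(paste, c(unname(cols), sep = ","))
}

# Small example data frame (assumed, not from the original post)
x <- data.frame(a = c(1.5, NA), b = c("u", "v"), stringsAsFactors = FALSE)

lines <- df_to_csv_lines(x)                         # one element per row
out <- paste0(paste(lines, collapse = "\n"), "\n")  # single string, final \n
```

Here 'lines' has the same shape as the result of capture.output() in the
original question (a character vector with one element per row), and
'out' is the single newline-joined string described above.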
Barry