thr3ads.net - R devel - [Rd] On read.csv and write.csv [Jul 2021]

Gabriel Becker

2021-Jul-01 22:29 UTC

[Rd] On read.csv and write.csv

On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison <S.Ellison at lgcgroup.com>
wrote:
>
> Please run the reproducible example provided.
> When you do, you will see that write.csv writes an unnecessary empty
> header field ("") over the row names column. This makes the number of
> header fields equal to the number of columns _including_ row names. That
> causes the original row names to be read as data by read.csv, following the
> rule that the number of header fields determines whether row names are
> present. read.csv accordingly assumes that the former row names are
> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
> exists) and then adds new, default, row names _instead of the original row
> names written by write.csv_.
> That's not helpful.
>
This depends on if you are reading the csv via R or something else, I would
imagine. It not being "valid" CSV at all would likely cause some programs
to choke entirely, I expect. I admit that's conjecture though, I don't have
data on that one way or another.

~G
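
For readers following the thread, the header-count difference under
discussion can be checked directly. The following is only a minimal sketch:
the data frame and file names are made up for illustration, and it compares
write.csv with write.table at its default col.names setting.

```r
# Hypothetical two-row data frame with explicit row names.
df <- data.frame(a = 1:2, b = c("u", "v"), row.names = c("r1", "r2"))

csv_file <- tempfile(fileext = ".csv")
tab_file <- tempfile(fileext = ".csv")

# write.csv adds an empty header field ("") over the row-names column.
write.csv(df, csv_file)

# write.table with its default col.names writes no header field for row names.
write.table(df, tab_file, sep = ",")

# Number of comma-separated fields on each line of each file.
csv_fields <- count.fields(csv_file, sep = ",")
tab_fields <- count.fields(tab_file, sep = ",")

print(readLines(csv_file)[1])  # header line written by write.csv
print(csv_fields)              # same count on every line
print(tab_fields)              # header is one field shorter than the data rows
```

Every line of the write.csv file has the same number of fields, which is
what tools expecting rectangular CSV generally want; the write.table file
has a header one field shorter than its data rows.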

Simon Urbanek

2021-Jul-01 23:00 UTC

[Rd] On read.csv and write.csv

Just for completeness, all this is well documented:

CSV files:

     By default there is no column name for a column of row names.  If
     'col.names = NA' and 'row.names = TRUE' a blank column name is
     added, which is the convention used for CSV files to be read by
     spreadsheets.  Note that such CSV files can be read in R by

       read.csv(file = "<filename>", row.names = 1)

Cheers,
Simon
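
As an illustration of the documented behaviour quoted above, here is a
minimal round-trip sketch. The data frame and temporary file are
hypothetical and not taken from the original reproducible example; it shows
read.csv with and without row.names = 1 on a file written by write.csv.

```r
# Hypothetical data frame with explicit row names.
df <- data.frame(a = 1:2, b = c("u", "v"), row.names = c("r1", "r2"))

f <- tempfile(fileext = ".csv")
write.csv(df, f)  # header line starts with an empty field for the row names

# Default read: the former row names come back as a data column named "X".
naive <- read.csv(f)
print(names(naive))     # "X" "a" "b"
print(rownames(naive))  # "1" "2"

# Documented form: treat the first column as row names.
fixed <- read.csv(f, row.names = 1)
print(names(fixed))     # "a" "b"
print(rownames(fixed))  # "r1" "r2"
```

With row.names = 1 the original row names are restored and no "X" column
appears, which is the usage the help page suggests for such files.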


