Jens Oehlschlägel
2003-Nov-06 19:41 UTC
Summary: [R] How to represent pure linefeeds chr(10) under R for Windows
Thanks to all who have responded.
My goal was to write a CSV file whose string columns can contain bare line
feeds, chr(10).
Why? Excel allows line feeds chr(10) within cells and uses chr(13)+chr(10)
as the line ending,
but the Windows version of R automatically replaces \n by \r\n when writing
and \r\n by \n when reading (text mode).
The clues for a solution came from Brian Ripley and Thomas Lumley: we need
to use "binary" connection mode (which will not replace \n by \r\n) and an
explicit specification of the line ending as "\r\n".
Testing with these gave the following results:
## write.table / read.table: a bit inconsistent: need a text connection
##   to read and a binary connection to write
## writeLines / readLines: readLines lacks a sep= parameter to properly
##   read in such data
## writeChar / readChar: OK
Thanks again and
Best regards
Jens Oehlschlägel
## Details
filename <- "c:/tmp/c2.csv"
## write.table / read.table: a bit inconsistent: need a text connection to
## read and a binary connection to write
data <- data.frame(a='c\nd', b='"äöüÄÖÜß"')
# writing in text mode replaces \n by \r\n
file <- file(filename, "w")
write.table(data, row.names=FALSE, file=file, sep=";",
qmethod="double")
close(file)
# writing in binary mode does not replace \n, however the real line endings
# are also \n instead of \r\n
file <- file(filename, "wb")
write.table(data, row.names=FALSE, file=file, sep=";",
qmethod="double")
close(file)
# using the eol parameter we can create the desired csv format (which can be
# read by Excel)
file <- file(filename, "wb")
write.table(data, row.names=FALSE, file=file, sep=";",
qmethod="double",
eol="\r\n")
close(file)
# for the read test, write a dataset that avoids a reported bug in
# read.table()
data <- data.frame(a=c(rep("x", 5), "c\nd"),
b=c(rep("y", 5), '"äöüÄÖÜß"'))
file <- file(filename, "wb")
write.table(data, row.names=FALSE, file=file, sep=";",
qmethod="double",
eol="\r\n")
close(file)
# reading, surprisingly, works on a text mode connection
file <- file(filename, "r")
read.csv2(file)
close(file)
# and doesn't work on a binary connection
file <- file(filename, "rb")
read.csv2(file)
close(file)
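
The byte-level effect of a binary connection combined with eol="\r\n" can be checked directly. The following is a minimal sketch, not part of the original tests: it assumes a temporary file instead of the fixed path above and ASCII-only cell contents, so that encoding does not enter the picture. readBin() returns raw bytes, so no newline translation is involved.

```r
# Sketch: inspect the raw bytes produced by write.table() on a binary
# connection. tempfile() replaces the fixed path used above (an assumption).
path <- tempfile(fileext = ".csv")
data <- data.frame(a = "c\nd", b = "x", stringsAsFactors = FALSE)

con <- file(path, "wb")
write.table(data, file = con, row.names = FALSE, sep = ";",
            qmethod = "double", eol = "\r\n")
close(con)

# Read the whole file as raw bytes; nothing is translated on the way in.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)

cr <- as.raw(13L)  # \r, chr(13)
lf <- as.raw(10L)  # \n, chr(10)

# Position of every line feed, and whether a carriage return precedes it.
lf_pos <- which(bytes == lf)
preceded_by_cr <- bytes[lf_pos - 1L] == cr
print(preceded_by_cr)
```

A line feed that is not preceded by a carriage return is the bare chr(10) inside a cell; the record-ending line feeds should each follow a chr(13).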
## writeLines / readLines: readLines lacks a sep= parameter to properly
## read in such data
data <- c('a;b', 'c\nd;"äöüÄÖÜß"')
# text mode substitutes \n -> \r\n like in write.table
file <- file(filename, "w")
writeLines(data, file, sep="\n")
close(file)
# we can write out the desired one using binary mode and sep="\r\n"
file <- file(filename, "wb")
writeLines(data, file, sep="\r\n")
close(file)
# However, we cannot read this in binary mode; readLines lacks a
# sep parameter
file <- file(filename, "rb")
readLines(file)
close(file)
# text mode replaces as expected
file <- file(filename, "r")
readLines(file)
close(file)
## writeChar / readChar: OK
data <- c('a;b\r\nc\nd;"äöüÄÖÜß"')
# writing text mode substitutes as expected
file <- file(filename, "w")
writeChar(data, file, eos=NULL)
close(file)
# writing binary mode works
file <- file(filename, "wb")
writeChar(data, file, eos=NULL)
close(file)
# reading binary mode works
file <- file(filename, "rb")
readChar(file, nchar(data))
close(file)
# reading text mode substitutes as expected
file <- file(filename, "r")
readChar(file, nchar(data))
close(file)
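
Since readChar on a binary connection returns the bytes unchanged, the missing sep= of readLines can be worked around by reading the whole file and splitting on "\r\n" manually. This is a minimal sketch under assumptions: read_crlf_records() is a hypothetical helper (not part of R), a temporary file replaces the fixed path above, and the sample data is ASCII-only because the string returned by readChar carries no declared encoding.

```r
# Sketch: split a CRLF-delimited file into records while keeping bare \n
# inside fields. read_crlf_records() is a hypothetical helper name.
read_crlf_records <- function(path) {
  size <- file.info(path)$size
  con <- file(path, "rb")
  on.exit(close(con))
  # useBytes = TRUE makes nchars count bytes, matching file.info()$size.
  txt <- readChar(con, nchars = size, useBytes = TRUE)
  # fixed = TRUE treats "\r\n" as a literal separator, not a regex.
  strsplit(txt, "\r\n", fixed = TRUE)[[1]]
}

# Write two records, the second containing a bare line feed in a field.
path <- tempfile(fileext = ".csv")
con <- file(path, "wb")
writeLines(c("a;b", "c\nd;\"x\""), con, sep = "\r\n")
close(con)

records <- read_crlf_records(path)
print(records)
```

Each element of records is one CRLF-terminated line of the file; splitting the fields on ";" and handling the quotes is left to the caller.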
--
Gabor Grothendieck
2003-Nov-06 20:33 UTC
Summary: [R] How to represent pure linefeeds chr(10) under R for Windows
It's also possible to avoid these intricacies by not
using an intermediate text representation, i.e. csv,
in the first place.
The following R code uses the free dataload utility
(Google search for Baird dataload utility) to create
an .xls file from a data frame, x:
save(x,file="x.rda")
system("dataload x.rda x.xls/u")
At this point you can read x.xls into Excel.
Gabor Grothendieck
2003-Nov-07 13:04 UTC
Summary: [R] How to represent pure linefeeds chr(10) under R for Windows
While I don't disagree with what you say, the purpose of this is to
interface to Excel which is even less free (you have to pay for Excel
but not for dataload) so perhaps the status of the glue used between R
and Excel is not as important.

From an expediency viewpoint, I found that dataload solves a wide variety
of interfacing problems easily, typically in a single line of code, using
a single tool and consistent syntax. I can translate easily among .rda,
.xls, .csv, .txt and numerous other formats.

---
Date: Fri, 7 Nov 2003 10:32:44 +0100
From: Martin Maechler <maechler at stat.math.ethz.ch>
To: <ggrothendieck at myway.com>
Cc: <joehl at gmx.de>, <r-help at stat.math.ethz.ch>
Subject: Re: Summary: [R] How to represent pure linefeeds chr(10) under R for Windows

>>>>> "Gabor" == Gabor Grothendieck <ggrothendieck at myway.com>
>>>>>     on Thu, 6 Nov 2003 15:33:04 -0500 (EST) writes:

    Gabor> Its also possible to avoid these intricacies by not
    Gabor> using an intermediate text representation, i.e. csv,
    Gabor> in the first place.
    Gabor> The following R code uses the free dataload utility
    Gabor> (Google search for Baird dataload utility) to create
    Gabor> an .xls file from data frame, x:
    Gabor> save(x,file="x.rda")
    Gabor> system("dataload x.rda x.xls/u")
    Gabor> At this point you can read x.xls into Excel.

Note that this has two "problems" IMO, which Jens' R-only solution
does not have:

1) dataload is *not* free software in the sense of the Free Software
   Foundation (which has existed for a much longer time than MS
   windows!): It's only "free" as in "free beer", not "free" as in
   "free speech". For more, read the "Free as in Freedom" main link on
   http://www.fsf.org/

2) dataload is only available as *binary* on *some* platforms, as
   opposed to R which is available to everyone working with it :-)

Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><