similar to: R-2.14.0: read.csv2 with fileEncoding="UTF-8"

2017 Apr 30
0
Any progress on write.csv fileEncoding for UTF-16 and UTF-32 ?
No, I don't think anyone is working on this. There's a fairly simple workaround for the UTF-16 and UTF-32 iconv issues: don't attempt to produce character vectors, produce raw vectors instead. (The "toRaw" argument to iconv() asks for this.) Raw vectors can contain embedded nulls. Character vectors can't, because internally, R is using 8 bit C strings, and the
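The workaround described in this reply can be sketched roughly as follows. This is a minimal illustration and not code from the thread: the sample string and the temporary file are assumptions, while iconv(), writeBin() and readBin() are base R functions.

```r
# Sample input (an assumption for illustration); ASCII only.
x <- "a\n"

# toRaw = TRUE makes iconv() return a list with one raw vector per input
# element, so the zero bytes of UTF-16 never have to be stored in a
# character string.
bytes <- iconv(x, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
print(bytes)

# A raw vector can be written to a file byte for byte.
path <- tempfile(fileext = ".txt")
writeBin(bytes, path)

# Read the file back as raw bytes to check that nothing was truncated.
round_trip <- readBin(path, what = "raw", n = file.size(path))
```

Because the data stays in a raw vector from conversion to output, the embedded nulls never pass through a C string.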
2017 May 02
0
Any progress on write.csv fileEncoding for UTF-16 and UTF-32 ?
Thanks for looking into this. A few notes regarding all the UTF encodings on Windows 10 ... The default eol for write.csv (via write.table) is "\n" and always gives as.raw (c (0x0d, 0x0a)), that is, <Carriage Return> <Line Feed> as adjacent bytes. This is fine for UTF-8 but wrong for UTF-16 and UTF-32. EXAMPLE: Using UTF-32 for exaggeration (note also that 3 nul bytes are
2016 Feb 24
0
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 24.02.2016 15:47, Duncan Murdoch wrote: > On 23/02/2016 7:06 AM, Mikko Korpela wrote: >> On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: >>> >>> > Dear R developers >>> > I think
2016 Feb 23
0
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 23.02.2016 11:37, Martin Maechler wrote: >>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: > > > Dear R developers > > I think I have found a bug that can be reproduced with two lines of code > > and I am very thankful to get your first assessment or feed-back
2016 Feb 24
0
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 24/02/2016 11:16 AM, Duncan Murdoch wrote: > On 24/02/2016 9:55 AM, Mikko Korpela wrote: >> On 24.02.2016 15:47, Duncan Murdoch wrote: >>> On 23/02/2016 7:06 AM, Mikko Korpela wrote: >>>> On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de>
2016 Feb 25
0
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 25.02.2016 11:31, Mikko Korpela wrote: > On 23.02.2016 14:06, Mikko Korpela wrote: >> On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: >>> >>> > Dear R developers >>> > I think I have
2017 Apr 29
2
Any progress on write.csv fileEncoding for UTF-16 and UTF-32 ?
"R version 3.4.0 (2017-04-21)" on "x86_64-w64-mingw32" platform I am using CSVs and other text tables, and text in general (including regular expressions), on Windows 10. For me, that means dealing with Windows-1252 and UTF-8 encoding, with UTF-16 and UTF-32 as helpful curiosities. Something as simple as iconv ("\n", to = "UTF-16") causes an error, due to
2017 May 01
3
Any progress on write.csv fileEncoding for UTF-16 and UTF-32 ?
On 30/04/2017 12:23 PM, Duncan Murdoch wrote: > No, I don't think anyone is working on this. > > There's a fairly simple workaround for the UTF-16 and UTF-32 iconv > issues: don't attempt to produce character vectors, produce raw vectors > instead. (The "toRaw" argument to iconv() asks for this.) Raw vectors > can contain embedded nulls. Character vectors
2010 Jan 12
0
Reading a file with mixed cyrillic/latin characters
Dear useRs, I am trying to read a tab-delimited Unicode text file containing both Latin and Cyrillic characters and failing miserably. The file looks like this (I hope it comes across right):

    A   B    C
    3   foo  ???
    5   bar  ???

    read.table("foo.txt",sep="\t",header=TRUE)

I am guessing that I can use the fileEncoding argument to read.table() to read this, but I can find no list of
2017 May 02
1
Any progress on write.csv fileEncoding for UTF-16 and UTF-32 ?
On 01/05/2017 8:49 PM, Jack Kelley wrote: > Thanks for looking into this. > > A few notes regarding all the UTF encodings on Windows 10 ... This all stems from the ancient bad decision by Microsoft to translate LF characters to CR LF when writing text files. R passes 0A or 0A 00 or 0A 00 00 00 to the output routine (part of the C run-time), and it needs to figure out how many
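The effect described here can be imitated in a few lines. The sketch below is only a simulation under stated assumptions: translate_lf() is a hypothetical helper that expands every 0A byte to 0D 0A, which is how a byte-wise text-mode translation would behave; it is not the actual C run-time routine. UTF-16LE is used as the example encoding.

```r
# A line feed encoded as UTF-16LE is two bytes: 0a 00.
lf <- iconv("\n", from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]

# A correct CR LF pair in UTF-16LE is four bytes: 0d 00 0a 00.
crlf <- iconv("\r\n", from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]

# Hypothetical helper: byte-wise LF -> CR LF expansion that knows nothing
# about the width of the encoding's code units.
translate_lf <- function(bytes) {
  ints <- as.integer(bytes)
  out <- lapply(ints, function(b) if (b == 0x0a) c(0x0d, 0x0a) else b)
  as.raw(unlist(out))
}

translated <- translate_lf(lf)
print(translated)  # three bytes, no longer a whole number of 16-bit units
print(crlf)        # what a UTF-16LE reader would expect instead
```

The byte-wise result has an odd length, so every following character in the file is shifted by one byte.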
2016 Feb 22
0
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
Dear R developers, I think I have found a bug that can be reproduced with two lines of code, and I would be very thankful for your first assessment of or feedback on my report. If this is the wrong mailing list or I did something wrong (e.g. a semi-"anonymous" email address to protect my privacy and defend against unwanted spam), please let me know since I am new here. Thank you very much :-) J.
2016 Feb 23
1
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
Excellent analysis, thank you both for the quick reply! Is there anything I can do to get the bug fixed in the next version of R (e. g. filing a bug report at https://bugs.r-project.org/bugzilla3/)? On Tue, 2016-02-23 at 14:06 +0200, Mikko Korpela wrote: > On 23.02.2016 11:37, Martin Maechler wrote: > >>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >
2016 Feb 24
2
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 24/02/2016 9:55 AM, Mikko Korpela wrote: > On 24.02.2016 15:47, Duncan Murdoch wrote: >> On 23/02/2016 7:06 AM, Mikko Korpela wrote: >>> On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: >>>>
2016 Feb 29
1
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
I have just committed your first patch (the strlen() replacement) to R-devel, and will soon put it in R-patched as well. I won't have time to look at this again before the 3.2.4 release, so your file.show() patch isn't going to make it unless someone else gets to it. There's still a faint chance that I'll do more in R-devel before 3.3.0, but I think it's best if there were bug
2016 Feb 24
2
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 23/02/2016 7:06 AM, Mikko Korpela wrote: > On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: >> >> > Dear R developers >> > I think I have found a bug that can be reproduced with two lines of code
2016 Feb 25
2
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
On 23.02.2016 14:06, Mikko Korpela wrote: > On 23.02.2016 11:37, Martin Maechler wrote: >>>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: >> >> > Dear R developers >> > I think I have found a bug that can be reproduced with two lines of code >>
2016 Feb 16
2
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
If I execute the code from the "?write.table" examples section

    x <- data.frame(a = I("a \" quote"), b = pi)
    # (omitted code)
    write.csv(x, file = "foo.csv", fileEncoding = "UTF-16LE")

the resulting CSV file has a size of 6 bytes, which is too short (truncated):

    """,3

The problem seems to be the iconv function:
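One way to sidestep the truncation, following the raw-vector idea suggested elsewhere in these threads, might look like the sketch below. It is an assumption-laden illustration, not a fix from the report: the CSV text is first captured in memory with capture.output(), then converted with iconv(toRaw = TRUE) and written with writeBin(). The variable names and the temporary file are illustrative, and no byte order mark is written.

```r
x <- data.frame(a = I("a \" quote"), b = pi)

# Capture the CSV as ordinary text lines instead of writing an encoded file.
csv_lines <- capture.output(write.csv(x))

# Join the lines with an explicit LF terminator and make sure the text is UTF-8.
text <- enc2utf8(paste0(paste(csv_lines, collapse = "\n"), "\n"))

# Convert to UTF-16LE as raw bytes; embedded nulls are fine in a raw vector.
bytes <- iconv(text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]

# Write the bytes unchanged (binary output, so no line-ending translation).
path <- tempfile(fileext = ".csv")
writeBin(bytes, path)

# Decode the file again to confirm the full content survived.
on_disk <- readBin(path, what = "raw", n = file.size(path))
decoded <- iconv(list(on_disk), from = "UTF-16LE", to = "UTF-8")
```

For this ASCII-only data frame the file is exactly twice as many bytes as the text has characters, rather than the 6 bytes reported above.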
2016 Feb 23
4
iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)
>>>>> nospam at altfeld-im de <nospam at altfeld-im.de> >>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes: > Dear R developers > I think I have found a bug that can be reproduced with two lines of code > and I am very thankful to get your first assessment or feed-back on my > report. > If this is the wrong mailing list or I
2008 May 01
1
Locale problem with umlauts in factor levels in 2.7.0 (patched) from grid or lattice
With 2.7.0 patched (not tested with 2.0.0), I get an error message in a program that ran correctly in R 2.6.2 when the grouping factor of a stripplot contains an umlaut. I am aware that there are a few locale changes in R 2.7.0, but I could not easily locate who's at fault. Dieter

    library(lattice)
    dt = data.frame(x=rnorm(100),y=1:100,levs= as.factor(c("Gru","Gr?")))
2008 Dec 15
0
mixed csv and csv2
Dear all, I have a huge problem after downloading and exporting data from Reuters3000 XTra: I downloaded many many monthly, quarterly and yearly data. I do not know why, but after saving, I have mixed-data sets, i.e. files which can be imported as “read.csv” and others that are in the format of “read.csv2”. Sure I could change them, but normally it should be possible to mix them… For
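A possible way to handle such a mixture is to guess the format per file before reading it. The sketch below is only a heuristic under stated assumptions: read_csv_auto() is a hypothetical helper (not part of base R) that counts semicolons and commas in the first line and then calls read.csv2() or read.csv() accordingly. It can guess wrong, for instance when a comma-separated header contains semicolons inside quoted names.

```r
# Hypothetical helper: choose between read.csv() and read.csv2() by
# comparing the number of ";" and "," characters in the first line.
read_csv_auto <- function(path, ...) {
  first <- readLines(path, n = 1L, warn = FALSE)
  if (length(first) == 0L) stop("file is empty: ", path)
  count <- function(ch) {
    lengths(regmatches(first, gregexpr(ch, first, fixed = TRUE)))
  }
  if (count(";") > count(",")) read.csv2(path, ...) else read.csv(path, ...)
}

# Two small sample files holding the same data in both conventions.
f_comma <- tempfile(fileext = ".csv")
f_semi  <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1.5,2"), f_comma)
writeLines(c("a;b", "1,5;2"), f_semi)

d_comma <- read_csv_auto(f_comma)
d_semi  <- read_csv_auto(f_semi)
```

Applied over a directory with lapply(), a helper like this would let both kinds of file be read in one pass.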