thr3ads.net - similar to: "locales and readLines"

How can I associate a list of defined names with the dataframes to be downloaded

2010 Feb 16

An embedded and charset-unspecified text was scrubbed... Name: ??????????? URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100216/9b6b4555/attachment.pl>

Locales and filenames

2009 Oct 27

I have a file which contains non-ASCII characters (umlauts, accented characters, etc.) both in its filename as well as its content. The only way I have been able to see these characters is inside vim, where they are displayed correctly no matter what I have LANG set to. My default LANG is en_US.utf8, but I have tried de_DE.utf8, de_DE.iso88591, and various others. In the output of a

readLines interaction with gsub different in R-dev

2018 Feb 17

| Confirmed for R-devel (current) on Ubuntu 17.10. But ... isn't the regexp | you use wrong, ie isn't R-devel giving the correct answer? No, I don't think R-devel is correct (or at least consistent with the documentation). My interpretation of gsub("(\\w)", "\\U\\1", entry, perl = TRUE) is "Take every word character and replace it with itself, converted to

A GSE data in the web of ncbi, GSE3524 cannot be open correctly

2007 Jan 31

Hi all, I ran into a problem querying GSE3524, which cannot be opened on my computer. I hope some of you would be kind enough to give me some advice. Thanks! The code is as follows: ################## library(GEOquery) gsename="GSE3524" gse=getGEO(gsename) ################## The error information follows as

Windows iconv() "failure" in certain locales

2017 Jun 27

This is a continuation of the R-devel thread with subject "suggestion to fix packageDescription() for Windows users" : As I said there, a patch should rather address the underlying problem in packageDescription rather than a kludgy workaround patch for citation(). (For that same reason, Ben Marwick proposed to fix packageDescription() rather than the symptom seen in citation().)

Windows iconv() "failure" in certain locales

2017 Jun 29

>>>>> Uwe Ligges <ligges at statistik.tu-dortmund.de> >>>>> on Wed, 28 Jun 2017 18:45:59 +0200 writes: > On 27.06.2017 17:36, Martin Maechler wrote: >> This is a continuation of the R-devel thread with subject >> "suggestion to fix packageDescription() for Windows users" : >> >> As I said there, a

Sweave output encoding in R-2.10.0beta on Windows (Rgui <-> Rterm)

2009 Oct 13

Dear developers, I have come across a (somewhat strange) change in the encoding of Sweave output from R-2.9.2pat to R-2.10.0beta (apparently specific to Rgui) on Windows installations. Of course, the NEWS file contains quite a few changes concerning encoding, but I was not able to locate an entry which explains the observed behaviour. I am not very familiar with encodings/locales/codepages,

special latin1 do not print as glyphs in current devel on windows

2017 Sep 14

This is a follow-up on my initial posts regarding character encodings on Windows (https://stat.ethz.ch/pipermail/r-devel/2017-August/074728.html) and Patrick Perry's reply (https://stat.ethz.ch/pipermail/r-devel/2017-August/074830.html) in particular (thank you for the links and the bug report!). My initial posts were quite chaotic (and partly wrong), so I am trying to clear things up a

special latin1 do not print as glyphs in current devel on windows

2017 Aug 01

Thank you! My apologies again for not including the console output in my message before. I sent another e-mail with the output in the meantime, so it should be a bit clearer now what I am seeing. In case I missed something, please let me know. Yes, I am using latin1 and cp1252 interchangeably here, mostly because Encoding() is reporting the encoding as "latin1". You presumed correctly

special latin1 do not print as glyphs in current devel on windows

2017 Aug 01

Upon further inspection, I think there are at least two problems. First the issue with printing latin1/cp1252 characters in the "80" to "9F" code range. x <- c("?", "?", "?") Encoding(x) print(x) I assume that these are Unicode escapes!? (Given that Encoding(x) shows "latin1" I'd rather expect latin1/cp1252 escapes here, but

Writing UTF8 on Windows

2014 Oct 19

Recent functionality in jsonlite allows for streaming json to a user supplied connection object, such as a file, pipe or socket. RFC7159 prescribes json must be encoded as unicode; ISO-8859 (including latin1) is invalid. Hence I would like R to write strings as utf8, irrespective of the type of connection, platform or locale. Implementing this turns out to be unsurprisingly difficult on windows.

Error in substring: invalid multibyte string

2020 Jun 27

Thanks for the quick response Ivan. readLines with encoding='latin1' works for me (on Ubuntu). However I was more concerned with the inconsistency in results between substr and regexpr. I was expecting that if one of them errors because of an unknown encoding then the other should as well. Even better, if regexpr works, why shouldn't substr work as well? Incidentally the analogous

reading and frequency analysis of Spanish text

2009 Aug 05

For an historical paper I'm working on, I have some Spanish plaintext, presently in the form of a Word .doc file, http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/Verdadera-spanish-stripped.doc and also some ciphered text from the same original source. The ultimate goal is to use some frequency analysis of letters and word lengths in the plaintext to help decode the

Problems when trying to load data

2014 Oct 10

Hello, good afternoon. For a few days I have been trying to load some microarray data from NCBI with the 32-bit version of R 2.15.2 on Windows XP. I used the following code: library(Biobase) library(GEOquery) library(limma) gset <- getGEO("GSE6536", GSEMatrix =TRUE) Doing so gives me this error: "Error in function (type, msg, asError = TRUE) : couldn't connect to

readLines interaction with gsub different in R-dev

2018 Feb 17

I was told to re-raise this issue with R-dev: In the documentation of R-dev and R-3.4.3, under ?gsub > replacement > ... For perl = TRUE only, it can also contain "\U" or "\L" to convert the rest of the replacement to upper or lower case and "\E" to end case conversion. However, the following code runs differently: tempf <- tempfile()

R 2.9.2 crashes when sorting latin1-encoded strings

2009 Sep 30

Hi everyone! I think I stumbled over a bug in the latest R 2.9.2 patched for OS X: > R version 2.9.2 Patched (2009-09-24 r49861) > i386-apple-darwin9.8.0 When I try to sort latin1-encoded character vectors, R sometimes crashes with a segmentation fault. I'm running OS X 10.5.8 and have observed this behaviour both with the i386 and x86_64 builds, in the R.app GUI as well as on

Reading in a table with ISO-latin1 encoding in MacOS-X (Intel)

2006 Jun 08

Dear colleagues in R, I have earlier been working with R in Linux, where reading in a table containing Scandinavian letters ("?", "?", and "?") in the header as part of variable names has not caused any problem whatsoever. However, trying to do the same in R running on the new MacOS-X (with an Intel processor) with the same original text table does not seem to

Strange characters in 2.1.0?

2005 Jun 06

Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Signif. codes: 0 <80><98>***<80><99> 0.001 <80><98>**<80><99> 0.01 <80><98>*<80> <99> 0.05 <80><98>.<80><99> 0.1 <80><98> <80><99> 1 Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 hmm... they go away when I

ouml in an .Rd

2006 Jan 06

I am trying to put an ouml in an .Rd file with no success. Writing R Extensions suggests: Text which might need to be represented differently in different encodings should be marked by |\enc|, e.g. |\enc{Jöreskog}{Joreskog}| where the first argument will be used where encodings are allowed and the second should be ASCII (and is used for e.g. the text conversion). (Above may get mangled by

ouml in an .Rd

2006 Jan 06
