similar to: gsub, utf-8 replacements and the C-locale

2013 May 01
1
Windows, format.POSIXct and character encodings
Hi all, In what encoding does format.POSIXct return its output? It doesn't seem to be utf-8: Sys.setlocale("LC_ALL", "Japanese_Japan.932") times <- c("1970-01-01 01:00:00 UTC", "1970-02-02 22:00:00 UTC") ampm <- format(as.POSIXct(times), format = "%p") x <- gsub(">", "*", paste(ampm, collapse =
2010 Dec 10
1
Consistency of variable storage in R and Sys.setlocale (is this a feature or bug)?
<I was not sure if this should go to R-devel or R-help. If I e-mailed this to the wrong place, please let me know.> Hello dear R-devel members, I came by an oddity, with regards to how character variables are being transformed when they are in Hebrew, and when Sys.setlocale is changed. Here is an example: # first, let's set the locale to Hebrew Sys.setlocale("LC_ALL",
2015 Jan 22
1
R CMD check: Locale not set to C?
Dear All The "R CMD check" on the "zoo" (1.7-11) package results in an error on my environment. It can be reduced to the following example: ---------------------------------------------------- > require(zoo) > read.zoo(system.file("doc", "demo1.txt", package = "zoo"), sep = "|", format="%d %b %Y") Error in
2010 Apr 15
1
Changing locale?
Hi I need for a specific application to change the locale of R 2.9.2 in Ubuntu 9.04. Trying the example in ?Sys.setlocale: Sys.setlocale("LC_TIME", "de_DE.utf8") [1] "" Warning message: In Sys.setlocale("LC_TIME", "de_DE.utf8") : la requ?te OS pour sp?cifier la localisation ? "de_DE.utf8" n'a pas pu ?tre honor?e I tried the code
2023 May 30
3
why does [A-Z] include 'T' in an Estonian locale?
Inspired by this old Stack Overflow question https://stackoverflow.com/questions/19765610/when-does-locale-affect-rs-regular-expressions I was wondering why this is TRUE: Sys.setlocale("LC_ALL", "et_EE") grepl("[A-Z]", "T") TRE's documentation at <https://laurikari.net/tre/documentation/regex-syntax/> says that a range "is shorthand for
2014 Apr 30
2
make fullcheck fails: strtod/atof and locale
make fullcheck fails on my computer: flac cannot recognize --skip option that contains decimal point, e.g. "--skip=1.234". System locale uses comma as a separator, so strtod/atof expect comma, not point, and "make fullcheck" fails. Here's what I can see in FLAC source code: atof() function found in: file: src/share/grabbag/seektable.c function:
2014 Jul 28
1
Parsing and deparsing of escaped unicode characters
In both R and JSON (and many other languages), unicode characters can be escaped using a backslash followed by a lowercase "u" and a 4 digit hex code. However when deparsing a character vector in R on Windows, the non-latin characters get escaped as "<U+" followed by their 4 digit hex code and ">": > x <- "I like \u5BFF\u53F8" > cat(x) I like
2012 Oct 24
2
R CMD BATCH: set locale?
Hi, I would like to change the locale when using R CMD BATCH. Usually, if I want to run it in English, for R in console/GUIs, I edit the .Rprofile file, adding: Sys.setlocale("LC_ALL","en_US.UTF8") Sys.setlocale("LC_MESSAGES","en_US.UTF8") But while this works for interactive R, it does not for R CMD BATCH. The problem is that running tests for a package,
2009 Jul 01
2
locale changing on Windows
Dear r-helpers, This is a little bit more of a Windows problem than an R problem, but ... any idea how to query the *available* locales from within R (or otherwise) on a Windows system? Teaching in a Spanish-language setting and would like to do something like Sys.setlocale("LC_TIME","en_US") (for example so that we can convert dates like "1970-jan-01" with
2011 May 04
1
issue with "strange" characters (locale settings)
WinXP-x32, R-2.13.0 Dear list, I have a problem that (I think) relates to the interaction between Windows and R. I am trying to scrape a table with data on the Hawai'ian Islands. This is my code: library(XML) u <- "http://en.wikipedia.org/wiki/Hawaii" tables <- readHTMLTable(u) Islands <- tables[[5]] The output is (first set of columns):
2017 Jun 23
2
LC_TIME not set correctly by Sys.setlocale() ?
Related to the following question on Stackoverflow: https://stackoverflow.com/questions/44723690/unexpected-behavior-of-sys-setlocale#44723690 It appears as if Sys.setlocale() does not update LC_TIME correctly for use in date formatting. Although R reports that LC_TIME is changed to the new setting after use of Sys.setlocale(), as.Date() still uses the old settings. The only way to update this is
2017 Jun 19
0
\U or \L perl regex in gsub removes text outside capturing group in UTF-8 contexts
I write to clarify the status of \U and \L when used in the replacement argument to gsub in R 3.5.0. The behaviour of gsub appears to have changed from R 3.4.0, but the documentation for the replacement argument has not. ## Reprex (A call to readLines is essential. A url is provided for convenience but the behaviour should reproduce for local files) bib <- readLines("
2007 Mar 11
1
Sys.setlocale("LC_CTYPE","fr_FR.UTF-8")
Dear R users, I'm trying to have a gWidgetsRGtk2 script run under R-2.4.1. The script runs OK under Linux but all accented characters appear as "?" when the script is run under Windows. As Gtk+ requires UTF-8, I thought it was the source of the problem and tried to change the default encoding (1252) in the following way:
2015 Jul 06
7
[PATCH 1/1] paint visual host key with unicode box-drawing characters
From: Christian Hesse <mail at eworm.de> Signed-off-by: Christian Hesse <mail at eworm.de> --- sshkey.c | 47 ++++++++++++++++++++++++++++++++++++----------- 1 file changed, 36 insertions(+), 11 deletions(-) diff --git a/sshkey.c b/sshkey.c index cfe5980..47511c2 100644 --- a/sshkey.c +++ b/sshkey.c @@ -44,6 +44,9 @@ #include <stdio.h> #include <string.h> #include
2013 Sep 09
2
Invalid UTF-8 with gsub(perl=TRUE) and iconv(sub="")
Hi! I experience an error with an invalid UTF-8 character passed to gsub(..., perl=TRUE); the interesting point is that with perl=FALSE (the default) no error happens. (The character itself was read from an invalid HTML file.) Illustration of the error: gsub("a", "", "\U3e3965", perl=FALSE) # [1] "\U3e3965" gsub("a", "",
2003 Dec 05
1
How to use Sys.setlocale("LC_NUMERIC")?
Can you help me to use Sys.setlocale("LC_NUMERIC", "cs_CZ") (comma as a decimal point) in some useful way, without all the workarounds? After switching to Sys.setlocale("LC_NUMERIC", "cs_CZ"): -- How do I set attributes in read.csv2() not to get columns of real numbers (decimal point = comma, field separator = semicolon) as factors? Workaround: I can go
2010 Dec 07
1
Encoding problem - I fails to read Hebrew text from online
Hello all, # I am trying to read the text in this URL: u <- "http://google.com/complete/search?output=toolbar&q=%d7%a9%d7%9c%d7%95%d7%9d" # By using this command: readLines(u) And no matter what variation I tried, I keep getting this output: [1] "<?xml version=\"1.0\"?><toplevel><CompleteSuggestion><suggestion
2023 Jun 01
1
why does [A-Z] include 'T' in an Estonian locale?
On 5/30/23 17:45, Ben Bolker wrote: > Inspired by this old Stack Overflow question > > https://stackoverflow.com/questions/19765610/when-does-locale-affect-rs-regular-expressions > > > I was wondering why this is TRUE: > > Sys.setlocale("LC_ALL", "et_EE") > grepl("[A-Z]", "T") > > TRE's documentation at >
2017 May 19
2
test fails when requesting LC_CTYPE
On RedHat Enterprise Linux 6, the test below fails (this is using the stock GCC 4.4.7) from R-devel r72707. LC_CTYPE is unset when I run it, but LANG=en_US.UTF-8 It also failed "yesterday" where as far as I recall the test code looked a bit different. Best, Kasper > ## Results differed by platform, but some gave incorrect results on string 10. > > > ## str() on large
2015 Apr 09
5
Going around in circles with UTF-8 in RStudio
Hello, once again going around in circles with UTF8; I'm sure this is a perennial topic on this list and has already been answered, so scold me for that (and for writing without accents). I generate an .rda on Unix with the system default UTF8 and bring it over to a Windows machine. I have RStudio on Windows configured with Global Options > Default text encoding UTF8. I load the .rda with load and nothing, the accents are ruined. In short,