Displaying 20 results from an estimated 10000 matches similar to: "gsub, utf-8 replacements and the C-locale"
2013 May 01
1
Windows, format.POSIXct and character encodings
Hi all,
In what encoding does format.POSIXct return its output? It doesn't
seem to be utf-8:
Sys.setlocale("LC_ALL", "Japanese_Japan.932")
times <- c("1970-01-01 01:00:00 UTC", "1970-02-02 22:00:00 UTC")
ampm <- format(as.POSIXct(times), format = "%p")
x <- gsub(">", "*", paste(ampm, collapse =
2010 Dec 10
1
Consistency of variable storage in R and Sys.setlocale (is this a feature or bug)?
<I was not sure if this should go to R-devel or R-help. If I e-mailed this
to the wrong place, please let me know.>
Hello dear R-devel members,
I came across an oddity in how character variables are
transformed when they are in Hebrew and Sys.setlocale is changed.
Here is an example:
# first, let's set the locale to Hebrew
Sys.setlocale("LC_ALL",
2015 Jan 22
1
R CMD check: Locale not set to C?
Dear All
The "R CMD check" on the "zoo" (1.7-11) package results in an error on my
environment. It can be reduced to the following example:
----------------------------------------------------
> require(zoo)
> read.zoo(system.file("doc", "demo1.txt", package = "zoo"), sep = "|",
format="%d %b %Y")
Error in
2010 Apr 15
1
Changing locale?
Hi
I need for a specific application to change the locale of R 2.9.2 in
Ubuntu 9.04. Trying the example in ?Sys.setlocale:
Sys.setlocale("LC_TIME", "de_DE.utf8")
[1] ""
Warning message:
In Sys.setlocale("LC_TIME", "de_DE.utf8") :
la requête OS pour spécifier la localisation à "de_DE.utf8" n'a pas pu
être honorée
I tried the code
2023 May 30
3
why does [A-Z] include 'T' in an Estonian locale?
Inspired by this old Stack Overflow question
https://stackoverflow.com/questions/19765610/when-does-locale-affect-rs-regular-expressions
I was wondering why this is TRUE:
Sys.setlocale("LC_ALL", "et_EE")
grepl("[A-Z]", "T")
TRE's documentation at
<https://laurikari.net/tre/documentation/regex-syntax/> says that a
range "is shorthand for
2014 Apr 30
2
make fullcheck fails: strtod/atof and locale
make fullcheck fails on my computer: flac cannot recognize --skip option
that contains decimal point, e.g. "--skip=1.234".
System locale uses comma as a separator, so strtod/atof expect comma, not point,
and "make fullcheck" fails.
Here's what I can see in FLAC source code:
atof() function found in:
file: src/share/grabbag/seektable.c
function:
2014 Jul 28
1
Parsing and deparsing of escaped unicode characters
In both R and JSON (and many other languages), unicode characters can
be escaped using a backslash followed by a lowercase "u" and a 4 digit
hex code. However when deparsing a character vector in R on Windows,
the non-latin characters get escaped as "<U+" followed by their 4
digit hex code and ">":
> x <- "I like \u5BFF\u53F8"
> cat(x)
I like
2012 Oct 24
2
R CMD BATCH: set locale?
Hi
I would like to change the locale when using R CMD BATCH. Usually, if I
want to run it in English, for R in console/GUIs, I edit the .Rprofile
file, adding:
Sys.setlocale("LC_ALL","en_US.UTF8")
Sys.setlocale("LC_MESSAGES","en_US.UTF8")
But while this works for interactive R, it does not for R CMD BATCH. The
problem is that running tests for a package,
2009 Jul 01
2
locale changing on Windows
Dear r-helpers,
This is a little bit more of a Windows problem than
an R problem, but ...
any idea how to query the *available* locales from
within R (or otherwise) on a Windows system? Teaching
in a Spanish-language setting and would like to do
something like
Sys.setlocale("LC_TIME","en_US")
(for example so that we can convert dates like
"1970-jan-01" with
2011 May 04
1
issue with "strange" characters (locale settings)
WinXP-x32, R-2.13.0
Dear list,
I have a problem that (I think) relates to the interaction between Windows
and R.
I am trying to scrape a table with data on the Hawai'ian Islands. This is my
code:
library(XML)
u <- "http://en.wikipedia.org/wiki/Hawaii"
tables <- readHTMLTable(u)
Islands <- tables[[5]]
The output is (first set of columns):
2017 Jun 23
2
LC_TIME not set correctly by Sys.setlocale() ?
Related to the following question on Stackoverflow:
https://stackoverflow.com/questions/44723690/unexpected-behavior-of-sys-setlocale#44723690
It appears as if Sys.setlocale() does not update LC_TIME correctly for use
in date formatting. Although R reports that LC_TIME is changed to the new
setting after use of Sys.setlocale(), as.Date() still uses the old
settings. The only way to update this is
2017 Jun 19
0
\U or \L perl regex in gsub removes text outside capturing group in UTF-8 contexts
I write to clarify the status of \U and \L when used in the replacement
argument to gsub in R 3.5.0. The behaviour of gsub appears to have changed
from R 3.4.0, but the documentation for the replacement argument has not.
## Reprex (A call to readLines is essential. A url is provided for
convenience but the behaviour should reproduce for local files)
bib <- readLines("
2007 Mar 11
1
Sys.setlocale("LC_CTYPE","fr_FR.UTF-8")
Dear R users,
I'm trying to have a gWidgetsRGtk2 script run under R-2.4.1. The script
runs OK under Linux but all accented characters appear as "?" when the
script is run under Windows.
As Gtk+ requires UTF-8, I thought it was the source of the problem and
tried to change the default encoding (1252) in the following way:
2015 Jul 06
7
[PATCH 1/1] paint visual host key with unicode box-drawing characters
From: Christian Hesse <mail at eworm.de>
Signed-off-by: Christian Hesse <mail at eworm.de>
---
sshkey.c | 47 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/sshkey.c b/sshkey.c
index cfe5980..47511c2 100644
--- a/sshkey.c
+++ b/sshkey.c
@@ -44,6 +44,9 @@
#include <stdio.h>
#include <string.h>
#include
2013 Sep 09
2
Invalid UTF-8 with gsub(perl=TRUE) and iconv(sub="")
Hi!
I experience an error with an invalid UTF-8 character passed to
gsub(..., perl=TRUE); the interesting point is that with perl=FALSE (the
default) no error happens. (The character itself was read from an
invalid HTML file.) Illustration of the error:
gsub("a", "", "\U3e3965", perl=FALSE)
# [1] "\U3e3965"
gsub("a", "",
2003 Dec 05
1
How to use Sys.setlocale("LC_NUMERIC")?
Can you help me to use Sys.setlocale("LC_NUMERIC", "cs_CZ") (comma as a
decimal point) in some useful way, without all the workarounds?
After switching to Sys.setlocale("LC_NUMERIC", "cs_CZ"):
-- How do I set attributes in read.csv2() not to get columns of real
numbers (decimal point = comma, field separator = semicolon) as factors?
Workaround: I can go
2010 Dec 07
1
Encoding problem - I fails to read Hebrew text from online
Hello all,
# I am trying to read the text in this URL:
u <- "http://google.com/complete/search?output=toolbar&q=%d7%a9%d7%9c%d7%95%d7%9d"
# By using this command:
readLines(u)
And no matter what variation I tried, I keep getting this output:
[1] "<?xml version=\"1.0\"?><toplevel><CompleteSuggestion><suggestion
2023 Jun 01
1
why does [A-Z] include 'T' in an Estonian locale?
On 5/30/23 17:45, Ben Bolker wrote:
> Inspired by this old Stack Overflow question
>
> https://stackoverflow.com/questions/19765610/when-does-locale-affect-rs-regular-expressions
>
>
> I was wondering why this is TRUE:
>
> Sys.setlocale("LC_ALL", "et_EE")
> grepl("[A-Z]", "T")
>
> TRE's documentation at
>
2017 May 19
2
test fails when requesting LC_CTYPE
On RedHat Enterprise Linux 6, the test below fails (this is using the stock
GCC 4.4.7) from R-devel r72707. LC_CTYPE is unset when I run it, but
LANG=en_US.UTF-8
It also failed "yesterday" where as far as I recall the test code looked a
bit different.
Best,
Kasper
> ## Results differed by platform, but some gave incorrect results on string 10.
>
>
> ## str() on large
2015 Apr 09
5
Going round in circles with UTF-8 in RStudio
Hi, I'm going round in circles with UTF-8 again. I'm sure this is a perennial topic on this list and has already been answered, so scold me for it (and for writing without accents).
I generate an .rda on Unix with the system default UTF8 and bring it over to a Windows machine.
I have RStudio on Windows configured with Global Options > Default text encoding UTF8.
I load the .rda with load and nothing, the accents are ruined. In other words,