Displaying 16 results from an estimated 16 matches for "mbcslocal".
2015 Mar 02
2
Errors on Windows with grep(fixed=TRUE) on UTF-8 strings
On Windows, grep(fixed=TRUE) throws errors with some UTF-8 strings.
Here's an example (must be run on Windows to reproduce the error):
Sys.setlocale("LC_CTYPE", "chinese")
y <- rawToChar(as.raw(c(0xe6, 0xb8, 0x97)))
Encoding(y) <- "UTF-8"
y
# [1] "?"
grep("\n", y, fixed = TRUE)
# Error in grep("\n", y, fixed = TRUE) : invalid
2015 Mar 04
0
Errors on Windows with grep(fixed=TRUE) on UTF-8 strings
...e")
grep("a", y, fixed = TRUE)
# Error in grep("a", y, fixed = TRUE) : invalid multibyte string at '<97>'
=======================
I believe the problem is in the main/grep.c file, in the fgrep_one
function. It tests for a multi-byte character string locale
`mbcslocale`, and then for the `use_UTF8`, like so:
if (!useBytes && mbcslocale) {
...
} else if (!useBytes && use_UTF8) {
...
} else ...
This can be seen at
https://github.com/wch/r-source/blob/e92b4c1cba05762480cd3898335144e5dd111cb7/src/main/grep.c#L668-L692
A...
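The branch order quoted in the result above can be sketched in isolation. This is a minimal illustration, not code from grep.c: the enum labels and both function names are hypothetical, and the reordered variant is only an assumption about what a fix might look like, since the rest of the message is truncated. It shows that when both `mbcslocale` and `use_UTF8` are true (presumably the case for a UTF-8-marked string in a Chinese locale on Windows), the quoted order never reaches the UTF-8 branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three paths of the quoted if/else chain. */
typedef enum { PATH_MBCS, PATH_UTF8, PATH_BYTES } match_path;

/* Branch order as quoted from fgrep_one: mbcslocale is tested first. */
static match_path path_as_quoted(bool useBytes, bool mbcslocale, bool use_UTF8)
{
    if (!useBytes && mbcslocale)
        return PATH_MBCS;   /* bytes are decoded in the native encoding */
    else if (!useBytes && use_UTF8)
        return PATH_UTF8;   /* bytes are decoded as UTF-8 */
    else
        return PATH_BYTES;  /* plain byte comparison */
}

/* Assumed alternative (not taken from the source): test use_UTF8 first. */
static match_path path_utf8_first(bool useBytes, bool mbcslocale, bool use_UTF8)
{
    if (!useBytes && use_UTF8)
        return PATH_UTF8;
    else if (!useBytes && mbcslocale)
        return PATH_MBCS;
    else
        return PATH_BYTES;
}
```

With `useBytes` false and both flags true, `path_as_quoted` selects the native multibyte path, which would be consistent with the "invalid multibyte string" error reported above; `path_utf8_first` selects the UTF-8 path instead.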
2007 Jun 24
2
problem gsub in the locale of CP932 and SJIS (PR#9751)
...0.orig/src/main/character.c 2007-04-03 11:05:05.000000000 +0900
+++ R-2.5.0/src/main/character.c 2007-06-24 22:31:06.000000000 +0900
@@ -986,6 +986,17 @@
char *p = repl;
n = strlen(repl) - (regmatch[0].rm_eo - regmatch[0].rm_so);
while (*p) {
+#ifdef SUPPORT_MBCS
+ if(mbcslocale){
+ int clen;
+ mbstate_t mb_st;
+ mbs_init(&mb_st);
+ if((clen = Mbrtowc(NULL, p, MB_CUR_MAX, &mb_st)) > 1){
+ p+=clen;
+ continue;
+ }
+ }
+#endif
if (*p == '\\') {
if ('...
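The patch in the result above steps over multibyte characters before looking for a backslash in the replacement string. The sketch below illustrates why that matters in CP932/Shift-JIS, where the second byte of a two-byte character can be 0x5C, the same value as `'\\'`. To stay runnable without a CP932 locale it swaps the patch's `Mbrtowc` call for a hand-written lead-byte test; the function names and the lead-byte ranges (0x81-0x9F, 0xE0-0xFC) are assumptions of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed Shift-JIS/CP932 lead-byte ranges: 0x81-0x9F and 0xE0-0xFC. */
static int sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Byte-by-byte scan: every 0x5C byte is counted as a backslash. */
static size_t count_backslashes_bytewise(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s; s++)
        if (*s == '\\')
            n++;
    return n;
}

/* Character-aware scan: a two-byte character is skipped as one unit,
   so a trail byte of 0x5C is never mistaken for a backslash. */
static size_t count_backslashes_sjis(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (*s) {
        if (sjis_is_lead((unsigned char)*s) && s[1] != '\0') {
            s += 2;
            continue;
        }
        if (*s == '\\')
            n++;
        s++;
    }
    return n;
}
```

For the byte sequence 0x95 0x5C (one two-byte character) followed by a real backslash and the digit 1, the byte-wise scan counts two backslashes while the character-aware scan counts one.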
2023 Jan 31
1
Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '<ff>' if an environment variable contains \xFF
...!= NULL; i++, e++)
- SET_STRING_ELT(ans, i, mkChar(*e));
+ for (i = 0, e = environ; *e != NULL; i++, e++) {
+ cetype_t enc = known_to_be_latin1 ? CE_LATIN1 :
+ known_to_be_utf8 ? CE_UTF8 :
+ CE_NATIVE;
+ if (
+ (utf8locale && !utf8Valid(*e))
+ || (mbcslocale && !mbcsValid(*e))
+ ) enc = CE_BYTES;
+ SET_STRING_ELT(ans, i, mkCharCE(*e, enc));
+ }
#endif
} else {
PROTECT(ans = allocVector(STRSXP, i));
@@ -416,11 +424,14 @@
if (s == NULL)
SET_STRING_ELT(ans, j, STRING_ELT(CADR(args), 0));
else {
- SEXP tmp;
- if(kn...
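The idea in the patch above (tag an environment entry as raw bytes when it is not valid in the session encoding) can be sketched outside R. This is a simplified illustration under stated assumptions: `enc_tag`, `utf8_valid_sketch` and `tag_for_env_entry` are hypothetical stand-ins for `cetype_t`, `utf8Valid` and the patched loop body, and the Latin-1 and non-UTF-8 multibyte cases are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CE_NATIVE, CE_UTF8 and CE_BYTES tags. */
typedef enum { TAG_NATIVE, TAG_UTF8, TAG_BYTES } enc_tag;

/* Minimal UTF-8 validator: checks lead/continuation structure and rejects
   overlong forms, surrogates and code points above 0x10FFFF. */
static bool utf8_valid_sketch(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    assert(s != NULL);
    while (*p) {
        int extra;
        unsigned long cp;
        if (*p < 0x80) { p++; continue; }
        else if ((*p & 0xE0) == 0xC0) { extra = 1; cp = *p & 0x1F; }
        else if ((*p & 0xF0) == 0xE0) { extra = 2; cp = *p & 0x0F; }
        else if ((*p & 0xF8) == 0xF0) { extra = 3; cp = *p & 0x07; }
        else return false;              /* stray continuation or 0xF8-0xFF */
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)    /* also catches a premature NUL */
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return false;               /* overlong encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}

/* Choose a tag for one "NAME=value" entry, mirroring the patch's order:
   start from the known encoding, then fall back to bytes if invalid. */
static enc_tag tag_for_env_entry(const char *entry, bool known_to_be_utf8,
                                 bool utf8locale)
{
    enc_tag enc = known_to_be_utf8 ? TAG_UTF8 : TAG_NATIVE;
    if (utf8locale && !utf8_valid_sketch(entry))
        enc = TAG_BYTES;
    return enc;
}
```

Under these assumptions, an entry such as `BOOM=\xFF` in a UTF-8 locale is tagged as bytes instead of being passed on as an invalid UTF-8 string.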
2023 Jan 31
1
Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '<ff>' if an environment variable contains \xFF
..._ELT(ans, i, mkChar(*e));
> + for (i = 0, e = environ; *e != NULL; i++, e++) {
> + cetype_t enc = known_to_be_latin1 ? CE_LATIN1 :
> + known_to_be_utf8 ? CE_UTF8 :
> + CE_NATIVE;
> + if (
> + (utf8locale && !utf8Valid(*e))
> + || (mbcslocale && !mbcsValid(*e))
> + ) enc = CE_BYTES;
> + SET_STRING_ELT(ans, i, mkCharCE(*e, enc));
> + }
> #endif
> } else {
> PROTECT(ans = allocVector(STRSXP, i));
> @@ -416,11 +424,14 @@
> if (s == NULL)
> SET_STRING_ELT(ans, j, STRING_ELT(C...
2023 Jan 30
2
Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '<ff>' if an environment variable contains \xFF
Hello.
SUMMARY:
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()"
Error in substring(x, m + 1L) : invalid multibyte string at '<ff>'
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv('BOOM')"
[1] "\xff"
BACKGROUND:
I launch R through a Son of Grid Engine (SGE) scheduler, where the R
2007 Sep 13
1
chartr better
...rn a->c_old - b->c_old;
+}
+static inline int xtable_key_comp(const wchar_t *a, const xtable_t *b)
+{
+ return *a - b->c_old;
+}
+
SEXP attribute_hidden do_chartr(SEXP call, SEXP op, SEXP args, SEXP env)
{
SEXP old, _new, x, y;
@@ -2064,14 +2074,18 @@
#ifdef SUPPORT_MBCS
if(mbcslocale) {
int j, nb, nc;
- wchar_t xtable[65536 + 1], c_old, c_new, *wc;
+ xtable_t *xtable;
+ int xtable_cnt;
+ wchar_t c_old, c_new, *wc;
const char *xi, *s;
struct wtr_spec *trs_old, **trs_old_ptr;
struct wtr_spec *trs_new, **trs_ne...
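The diff above replaces a fixed 65537-entry `wchar_t` table with a dynamically sized array of `xtable_t` entries and adds comparison helpers keyed on `c_old`. A plausible reading is a sorted table searched by binary search; the sketch below illustrates that technique with the standard `qsort` and `bsearch`. It is an assumption about the design, not the patch itself: the `c_new` field, `xtable_sort` and `xtable_translate` are hypothetical, and the comparators use explicit comparisons where the snippet uses subtraction.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* One translation pair; c_old appears in the snippet, c_new is assumed. */
typedef struct {
    wchar_t c_old;
    wchar_t c_new;
} xtable_t;

/* Order two table entries by the character being replaced. */
static int xtable_comp(const void *a, const void *b)
{
    wchar_t x = ((const xtable_t *)a)->c_old;
    wchar_t y = ((const xtable_t *)b)->c_old;
    return (x > y) - (x < y);
}

/* Compare a lookup key (a single wchar_t) with a table entry. */
static int xtable_key_comp(const void *key, const void *elem)
{
    wchar_t k = *(const wchar_t *)key;
    wchar_t c = ((const xtable_t *)elem)->c_old;
    return (k > c) - (k < c);
}

/* Sort the table once so it can be searched with bsearch. */
static void xtable_sort(xtable_t *t, size_t n)
{
    assert(t != NULL || n == 0);
    if (n > 1)
        qsort(t, n, sizeof *t, xtable_comp);
}

/* Translate one character; characters without an entry pass through. */
static wchar_t xtable_translate(const xtable_t *t, size_t n, wchar_t c)
{
    const xtable_t *hit;
    if (n == 0)
        return c;
    hit = bsearch(&c, t, n, sizeof *t, xtable_key_comp);
    return hit ? hit->c_new : c;
}
```

The table then grows with the number of translated characters instead of always occupying 65537 slots, at the cost of a logarithmic lookup per character.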
2023 Jan 31
2
Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '<ff>' if an environment variable contains \xFF
...>
>> * I don't know whether known_to_be_utf8 can disagree with utf8locale.
>> known_to_be_utf8 was the original condition for setting CE_UTF8 on
>> the string. I also need to detect non-UTF-8 multibyte locales, so
>> I'm checking for utf8locale and mbcslocale. Perhaps I should be more
>> careful and test for (enc == CE_UTF8) || (utf8locale && enc ==
>> CE_NATIVE) instead of just utf8locale.
>>
>> * I have verified that Sys.getenv() doesn't crash with UTF-8-invalid
>> strings in the environm...
2016 Jan 27
2
rstan warning messages
Confirmed that gcc-gfortran is installed
Package gcc-gfortran-4.4.7-16.el6.x86_64 already installed and latest version
What could I check next?
I do not have the following installed and will get that done and tested again.
libcurl-devel
libidn-devel
Thanks,
Larry
-----Original Message-----
From: Tom Callaway [mailto:tcallawa at redhat.com]
Sent: Wednesday, January 27, 2016
2016 Jan 28
2
rstan warning messages
...FALSE, use_wcs = FALSE;
^
myUTF8.c:112:14: warning: variable 'oct_or_hex' set but not used [-Wunused-but-set-variable]
Rboolean oct_or_hex = FALSE, use_wcs = FALSE;
^
myUTF8.c:101:14: warning: unused variable 'mbcslocale' [-Wunused-variable]
Rboolean mbcslocale = TRUE;
^
*** installing help indices
converting help for package 'RCurl'
finding HTML links ... done
...
binaryBuffer html
Rd warning: /tmp/RtmpaLgRR...
2013 May 01
1
Windows, format.POSIXct and character encodings
Hi all,
In what encoding does format.POSIXct return its output? It doesn't
seem to be utf-8:
Sys.setlocale("LC_ALL", "Japanese_Japan.932")
times <- c("1970-01-01 01:00:00 UTC", "1970-02-02 22:00:00 UTC")
ampm <- format(as.POSIXct(times), format = "%p")
x <- gsub(">", "*", paste(ampm, collapse =
2005 Jul 20
1
(PR#8017) build of REventLoop package crashes with 2.1 due
...for stdin
> */
> ---
>> extern char* R_TempDir INI_as(NULL); /* Name of per-session dir */
> 530d528
> < extern void R_setupHistory();
> 541,542c539
> < LibExtern Rboolean utf8locale INI_as(FALSE); /* is this a UTF-8 locale? */
> < LibExtern Rboolean mbcslocale INI_as(FALSE); /* is this a MBCS locale? */
> ---
>> extern Rboolean utf8locale INI_as(FALSE); /* is this a UTF-8 locale? */
> 596a594
>> # define duplicated Rf_duplicated
> 633c631
> < # define Mbrtowc Rf_mbrtowc
> ---
>> # define mat...
2005 Jul 19
0
build of REventLoop package crashes with 2.1 due to syntax error in Defn.h (PR#8017)
...""); /* Encoding assumed for stdin
*/
---
> extern char* R_TempDir INI_as(NULL); /* Name of per-session dir */
530d528
< extern void R_setupHistory();
541,542c539
< LibExtern Rboolean utf8locale INI_as(FALSE); /* is this a UTF-8 locale? */
< LibExtern Rboolean mbcslocale INI_as(FALSE); /* is this a MBCS locale? */
---
> extern Rboolean utf8locale INI_as(FALSE); /* is this a UTF-8 locale? */
596a594
> # define duplicated Rf_duplicated
633c631
< # define Mbrtowc Rf_mbrtowc
---
> # define match Rf_match
68...
2009 Apr 09
3
type.convert (PR#13646)
Full_Name: Stefan Raberger
Version: 2.8.1
OS: Windows XP
Submission from: (NULL) (213.185.163.242)
Hi there,
I recently noticed some strange behaviour of the command "type.convert",
depending on the startup mode used. But there also seems to be different
behaviour on different PCs (all running the same OS and the same version of R).
On PC1:
When I start R in SDI mode (RGui --no-save
2012 Jan 09
2
[R] fix and edit don't work: unable to open X Input
(moved from R-help)
I tried this on Ubuntu with R-2.14.1 built from source, and I do not get
the segfault problem. (I don't at the moment have a debian binary R, or
I would confirm whether I get the segfault problem.) My sessionInfo() is
reporting additional information about namespace imports:
> library(ggplot2)
Loading required package: reshape
Loading required package: plyr
2015 Aug 14
2
Build R on Haiku
Hi R-devel,
I'm trying to get R 3.2.1 working on Haiku (an open source OS inspired by
BeOS, not Linux based) on i586. With a few small changes to library paths
and ifdefs I am able to get a seemingly working R binary. The build process
stops with the 'tools' package. The last lines from make are below.
Does anyone have any tips? I'm rather new to debugging at this low level.
Are