thr3ads.net - similar to: "Windows, format.POSIXct and character encodings"

Displaying 20 results from an estimated 3000 matches similar to: "Windows, format.POSIXct and character encodings"

Errors on Windows with grep(fixed=TRUE) on UTF-8 strings

2015 Mar 02

On Windows, grep(fixed=TRUE) throws errors with some UTF-8 strings. Here's an example (must be run on Windows to reproduce the error):

    Sys.setlocale("LC_CTYPE", "chinese")
    y <- rawToChar(as.raw(c(0xe6, 0xb8, 0x97)))
    Encoding(y) <- "UTF-8"
    y
    # [1] "?"
    grep("\n", y, fixed = TRUE)
    # Error in grep("\n", y, fixed = TRUE) : invalid

unordered multinomial logistic regression (or logit model) with repeated measures (I think)

2009 Oct 08

I am attempting to examine the temporal independence of my data set and think I need an unordered multinomial logistic regression (or logit model) with repeated measures to do so. The data in question are the locations of chickens. Chickens could be in any one of 5 locations when a snapshot sample was taken. The locations of chickens (bird) in 8 pens (pen) were scored twice a day (AMPM) for 20 days

Is this surprising behavior of tkimage.create function a bug?

2020 Jul 09

The tkimage.create function can read some images but not others. We can reproduce this by running the code below.

    library(tcltk)
    library(magick)
    # works fine
    tmp <- tempfile(fileext = ".gif")
    image_write(logo, tmp)
    image_tcl <- tkimage.create("photo",

"read.table" and "scan" skips newlines which "count.fields" finds in Thai textfile

2010 Feb 03

Hi there, I have some problems reading in a Thai text. Some of the newlines are skipped. (see the contents of my file below)

    R> count.fields("my.txt", sep='\n', quote="")
    [1] 1 1 1

Three lines with one item each, right?

    R> scan("my.txt", what="", sep="\t", quote="")
    Read 2 items
    [1] "?\x83???\x88

rawToChar(raw(0))

2008 May 21

Hi, right now we have (on R v2.7.0 patched (2008-04-23 r45466)) that:

    > rawToChar(raw(0))
    [1] ""
    > rawToChar(raw(0), multiple=TRUE)
    character(0)

Is this intended or should both return character(0)? Personally, I would prefer that an empty input vector returns an empty output vector. Same should then apply to charToRaw(), but right now we get:

    > x <- character(0)
    >

grep and PCRE fun

2011 Sep 29

Hello, I think I've found a bug in the C function do_grep located in src/main/grep.c. It seems to affect both the latest revisions of R-2-13-branch and trunk when compiling R without optimizations and with its own version of pcre located in src/extra, at least on Ubuntu 10.04. According to the pcre_exec API (I presume the later versions), the ovecsize argument must be a multiple of 3,

writeLines argument useBytes = TRUE still making conversions

2018 Feb 17

Of course, right after writing this e-mail I tested on my Windows machine and did not see what I expected:

    > charToRaw(before)
    [1] c3 a9
    > charToRaw(after)
    [1] e9

so obviously I'm misunderstanding something as well. Best, Kevin

On Sat, Feb 17, 2018 at 2:19 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> From my understanding, translation is implied in this line of ?file

subRaw?

2012 Jul 20

Hello, All: Do you know of any capability to substitute more than one byte in an object of class Raw? Consider the following:

    > let4 <- paste(letters[1:4], collapse='')
    > (let4Raw <- charToRaw(let4))
    [1] 61 62 63 64
    > (let. <- sub('bc', '--', let4Raw))
    [1] "61" "62" "63" "64"
    > # no

R 2.7.0, match() and strings containing \0 - bug?

2008 Apr 28

Hi, A piece of my code that uses readBin() to read a certain file type is behaving strangely with R 2.7.0. This seems to be because of a failure to match() strings after using rawToChar() when the original was terminated with a "\0" character. Direct equality testing with == still works as expected. I can reproduce this as follows: > x <- "foo" > y <-

writeLines argument useBytes = TRUE still making conversions

2018 Feb 15

On Thu, Feb 15, 2018 at 11:19 AM, Kevin Ushey <kevinushey at gmail.com> wrote:
> I suspect your UTF-8 string is being stripped of its encoding before
> write, and so assumed to be in the system native encoding, and then
> re-encoded as UTF-8 when written to the file. You can see something
> similar with:
>
> > tmp <- 'é'
> > tmp <- iconv(tmp,

Consistency of serialize(): please enlighten me

2007 Aug 31

Hi, I am puzzled by serialize(). It comes down to generating identical hash codes for (apparently) identical objects using digest::digest(), which in turn relies on serialize(). Here is an example illustrating the issue:

    ser <- function(object, ...) {
      list(
        names = names(object),
        namesRaw = charToRaw(names(object)),
        ser = serialize(names(object), connection=NULL, ascii=FALSE)

Split String in regex while Keeping Delimiter

2023 Apr 13

Dear Emily, Using a look-behind solves the split problem in this case. (Note: Using Regex is in most/many cases the simplest solution.)

    str = c("leucocyten + gramnegatieve staven +++ grampositieve staven ++",
            "leucocyten ? grampositieve coccen +")
    tokens = strsplit(str, "(?<=[-+])\\s++", perl=TRUE)

PROBLEM The current expression does NOT work for a different

Paging MEETME_RECORDINGFILE Variable

2007 Sep 20

I am having a weird issue with setting the recording file for the Page app. Here is some quick background info. I have a macro that pages all my phones:

    [macro-pageall]
    ; Context for paging all devices.
    ; This will search the sip table in the realtime database
    ; for all phones that start with a number. That number is
    ; passed to this macro as ${ARG1}.
    ;
    ; ARG1 = The

getting corrupted data when using readBin() after seek() on a gzfile connection

2013 May 08

Hi, I'm running into more issues when reading data from a gzfile connection. If I read the data sequentially with successive calls to readBin(), the data I get looks ok. But if I call seek() between the successive calls to readBin(), I get corrupted data. Here is a (hopefully) reproducible example. See my sessionInfo() at the end (I'm not on Windows, where, according to the man page,

writeLines argument useBytes = TRUE still making conversions

2018 Feb 15

I think this behavior is inconsistent with the documentation:

    tmp <- 'é'
    tmp <- iconv(tmp, to = 'UTF-8')
    print(Encoding(tmp))
    print(charToRaw(tmp))
    tmpfilepath <- tempfile()
    writeLines(tmp, con = file(tmpfilepath, encoding = 'UTF-8'), useBytes = TRUE)

    [1] "UTF-8"
    [1] c3 a9

Raw text as hex: c3 83 c2 a9

If I switch to useBytes = FALSE, then

More elegant magnitude method

2010 Dec 07

I need to find the order of magnitude of a number to get a scaling parameter as a power of 10. I have a function that works *so far*, but it is ugly and probably buggy. In the interest of avoiding code-based outliers in my data, I thought I would ask if anyone here has a better way.

    > scl <- function(x){
    +   length(charToRaw(format(trunc(x), scientific = F)))-1}
    > a <- 123456789
    > b <-
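
One possible alternative to the string-length approach in the snippet above is to use the base-10 logarithm directly. This is only a sketch under assumptions: the helper name order_of_magnitude is hypothetical (it does not appear in the original post), it rejects zero and non-finite input, and unlike scl() it returns negative exponents for values whose absolute value is below 1.

```r
# Hypothetical helper (name is an assumption, not from the original post):
# returns the exponent k such that 10^k <= |x| < 10^(k + 1).
order_of_magnitude <- function(x) {
  # log10() is -Inf at zero, so reject zero and non-finite values explicitly.
  stopifnot(is.numeric(x), all(is.finite(x)), all(x != 0))
  floor(log10(abs(x)))
}

order_of_magnitude(123456789)      # 8
order_of_magnitude(c(5, 42, 0.5))  # 0 1 -1
```

Because the function is built from vectorized base R calls, it accepts a whole numeric vector at once.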

RFE: vectorize URLdecode

2009 Aug 29

In R 2.9.2,

    > URLdecode(c("a%20b", "b%20c"))
    [1] "a b"
    Warning message:
    In charToRaw(URL) :
      argument should be a character vector of length 1
      all but the first element will be ignored

Could URLdecode be modified to actually process all elements of the vector, not just the first? Thanks in advance
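
Until URLdecode handles vectors itself, one workaround is to apply it to each element in turn. The sketch below assumes only base R and the utils package; the wrapper name url_decode_all is hypothetical and not part of any package.

```r
# Hypothetical wrapper (name is an assumption): decode every element of a
# character vector by calling utils::URLdecode() once per element.
url_decode_all <- function(urls) {
  stopifnot(is.character(urls))
  # vapply() enforces that each call returns exactly one string.
  vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE)
}

url_decode_all(c("a%20b", "b%20c"))  # "a b" "b c"
```

Using vapply rather than sapply keeps the return type a character vector even when the input is empty.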

User input(unknown name and number of files)

2011 Jul 21

Dear all, I need your help, as I was not able to find a solution. I have code that reads a file like this:

    df=read.table("Case2.pileup",fill=T,sep="\t",colClasses="character")

But I am making a tool so that users can run the analysis on their own files. The name of the file will not be Case2.pileup, and I want to use this

special latin1 do not print as glyphs in current devel on windows

2017 Sep 14

This is a follow-up on my initial posts regarding character encodings on Windows (https://stat.ethz.ch/pipermail/r-devel/2017-August/074728.html) and Patrick Perry's reply (https://stat.ethz.ch/pipermail/r-devel/2017-August/074830.html) in particular (thank you for the links and the bug report!). My initial posts were quite chaotic (and partly wrong), so I am trying to clear things up a

odd behavior of names

2018 Jul 29

Bugzilla issue 16101 describes another first-list-name-printed-differently oddity with the Windows GUI version of R:

    > a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n"
    > Encoding(a) # expect "UTF-8"
    [1] "UTF-8"
    > sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1,1,2)]
    $`One is ????`
    [1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0
    [13] b8
