Displaying 20 results from an estimated 3000 matches similar to: "Windows, format.POSIXct and character encodings"
2015 Mar 02
2
Errors on Windows with grep(fixed=TRUE) on UTF-8 strings
On Windows, grep(fixed=TRUE) throws errors with some UTF-8 strings.
Here's an example (must be run on Windows to reproduce the error):
Sys.setlocale("LC_CTYPE", "chinese")
y <- rawToChar(as.raw(c(0xe6, 0xb8, 0x97)))
Encoding(y) <- "UTF-8"
y
# [1] "?"
grep("\n", y, fixed = TRUE)
# Error in grep("\n", y, fixed = TRUE) : invalid
2009 Oct 08
1
unordered multinomial logistic regression (or logit model) with repeated measures (I think)
I am attempting to examine the temporal independence of my data set and think
I need an unordered multinomial logistic regression (or logit model) with
repeated measures to do so. The data in question are the locations of chickens.
Chickens could be in any one of 5 locations when a snapshot sample was
taken. The locations of chickens (bird) in 8 pens (pen) were scored twice a
day (AMPM) for 20 days
2020 Jul 09
0
Is this surprising behavior of tkimage.create function a bug?
The tkimage.create function can read some images but not others.
We can reproduce it by running the code below.
-------------------------------------------------------------------------------------
library(tcltk)
library(magick)
# works fine
tmp <- tempfile(fileext = ".gif")
image_write(logo, tmp)
image_tcl <- tkimage.create("photo",
2010 Feb 03
0
"read.table" and "scan" skips newlines which "count.fields" finds in Thai textfile
Hi there,
I have some problems reading in a Thai text.
Some of the newlines are skipped.
(see the contents of my file below)
R>count.fields ("my.txt", sep='\n', quote="")
[1] 1 1 1
Three lines with one item each, right?
R> scan("my.txt", what="", sep="\t", quote="")
Read 2 items
[1] "?\x83???\x88
2008 May 21
1
rawToChar(raw(0))
Hi,
right now we have (on R v2.7.0 patched (2008-04-23 r45466)) that:
> rawToChar(raw(0))
[1] ""
> rawToChar(raw(0), multiple=TRUE)
character(0)
Is this intended or should both return character(0)? Personally, I
would prefer that an empty input vector returns an empty output
vector. Same should then apply to charToRaw(), but right now we get:
> x <- character(0)
>
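A minimal sketch of the behavior the poster prefers, where an empty raw vector maps to an empty character vector. The helper name `raw_to_char` is hypothetical and not part of base R; it only wraps `rawToChar()`.

```r
# Hypothetical helper (not in base R): an empty raw vector yields
# character(0) instead of "", otherwise rawToChar() is called as usual.
raw_to_char <- function(x, multiple = FALSE) {
  stopifnot(is.raw(x))
  if (length(x) == 0L) {
    return(character(0))
  }
  rawToChar(x, multiple = multiple)
}

raw_to_char(raw(0))            # empty input, empty output
raw_to_char(charToRaw("abc"))  # non-empty input is passed through
```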
2011 Sep 29
3
grep and PCRE fun
Hello,
I think I've found a bug in the C function do_grep located in
src/main/grep.c. It seems to affect both the latest revisions of
R-2-13-branch and trunk when compiling R without optimizations and
with its own version of pcre located in src/extra, at least on Ubuntu
10.04.
According to the pcre_exec API (I presume the later versions), the
ovecsize argument must be a multiple of 3,
2018 Feb 17
1
writeLines argument useBytes = TRUE still making conversions
Of course, right after writing this e-mail I tested on my Windows
machine and did not see what I expected:
> charToRaw(before)
[1] c3 a9
> charToRaw(after)
[1] e9
so obviously I'm misunderstanding something as well.
Best,
Kevin
On Sat, Feb 17, 2018 at 2:19 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> From my understanding, translation is implied in this line of ?file
2012 Jul 20
1
subRaw?
Hello, All:
Do you know of any capability to substitute more than one byte in
an object of class Raw?
Consider the following:
> let4 <- paste(letters[1:4], collapse='')
> (let4Raw <- charToRaw(let4))
[1] 61 62 63 64
> (let. <- sub('bc', '--', let4Raw))
[1] "61" "62" "63" "64"
> # no
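One possible approach, sketched under assumptions: convert the raw vectors to strings, substitute byte-wise, and convert back. The helper name `sub_raw` is hypothetical (not a base R function), and the sketch assumes none of the inputs contain a nul byte (0x00), which `rawToChar()` rejects.

```r
# Hypothetical helper: replace the first occurrence of one byte sequence
# with another inside a raw vector.
# Assumes no embedded nul (0x00) bytes in any of the three arguments.
sub_raw <- function(x, pattern, replacement) {
  stopifnot(is.raw(x), is.raw(pattern), is.raw(replacement))
  out <- sub(rawToChar(pattern), rawToChar(replacement), rawToChar(x),
             fixed = TRUE, useBytes = TRUE)
  charToRaw(out)
}

let4Raw <- charToRaw("abcd")
sub_raw(let4Raw, charToRaw("bc"), charToRaw("--"))
```

Matching with `fixed = TRUE` and `useBytes = TRUE` keeps the comparison at the byte level, so the pattern is not interpreted as a regular expression.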
2008 Apr 28
4
R 2.7.0, match() and strings containing \0 - bug?
Hi,
A piece of my code that uses readBin() to read a certain file type is
behaving strangely with R 2.7.0. This seems to be because of a failure
to match() strings after using rawToChar() when the original was
terminated with a "\0" character. Direct equality testing with ==
still works as expected. I can reproduce this as follows:
> x <- "foo"
> y <-
2018 Feb 15
2
writeLines argument useBytes = TRUE still making conversions
On Thu, Feb 15, 2018 at 11:19 AM, Kevin Ushey <kevinushey at gmail.com> wrote:
> I suspect your UTF-8 string is being stripped of its encoding before
> write, and so assumed to be in the system native encoding, and then
> re-encoded as UTF-8 when written to the file. You can see something
> similar with:
>
> > tmp <- 'é'
> > tmp <- iconv(tmp,
2007 Aug 31
1
Consistency of serialize(): please enlighten me
Hi,
I am puzzled by serialize(). It comes down to generating identical
hash codes for (apparently) identical objects using digest::digest(),
which in turn relies on serialize(). Here is an example illustrating
the issue:
ser <- function(object, ...) {
list(
names = names(object),
namesRaw = charToRaw(names(object)),
ser = serialize(names(object), connection=NULL, ascii=FALSE)
2023 Apr 13
1
Split String in regex while Keeping Delimiter
Dear Emily,
Using a look-behind solves the split problem in this case. (Note: Using
Regex is in most/many cases the simplest solution.)
str = c("leucocyten + gramnegatieve staven +++ grampositieve staven ++",
"leucocyten ? grampositieve coccen +")
tokens = strsplit(str, "(?<=[-+])\\s++", perl=TRUE)
PROBLEM
The current expression does NOT work for a different
2007 Sep 20
1
Paging MEETME_RECORDINGFILE Variable
I am having a weird issue with setting the recording file for the
Page app. Here is some quick background info
I have a macro that pages all my phones:
[macro-pageall]
; Context for paging all devices.
; This will search the sip table in the realtime database
; for all phones that start with a number. That number is
; passed to this macro as ${ARG1}.
;
; ARG1 = The
2013 May 08
1
getting corrupted data when using readBin() after seek() on a gzfile connection
Hi,
I'm running into more issues when reading data from a gzfile connection.
If I read the data sequentially with successive calls to readBin(), the
data I get looks ok. But if I call seek() between the successive calls
to readBin(), I get corrupted data.
Here is a (hopefully) reproducible example. See my sessionInfo() at the
end (I'm not on Windows, where, according to the man page,
2018 Feb 15
2
writeLines argument useBytes = TRUE still making conversions
I think this behavior is inconsistent with the documentation:
tmp <- 'é'
tmp <- iconv(tmp, to = 'UTF-8')
print(Encoding(tmp))
print(charToRaw(tmp))
tmpfilepath <- tempfile()
writeLines(tmp, con = file(tmpfilepath, encoding = 'UTF-8'), useBytes = TRUE)
[1] "UTF-8"
[1] c3 a9
Raw text as hex: c3 83 c2 a9
If I switch to useBytes = FALSE, then
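The byte sequences quoted in this thread can be examined with a short sketch. It assumes that opening the connection in binary mode without an `encoding` argument requests no re-encoding on output. The variable names `written` and `doubled` are illustrative only.

```r
# Build the two-byte UTF-8 string from raw bytes so that the result does
# not depend on the encoding of this script or the session locale.
tmp <- rawToChar(as.raw(c(0xc3, 0xa9)))
Encoding(tmp) <- "UTF-8"

# Assumption: a binary-mode connection with no `encoding` argument writes
# the bytes as they are when useBytes = TRUE.
path <- tempfile()
con <- file(path, open = "wb")
writeLines(tmp, con, useBytes = TRUE)
close(con)
written <- readBin(path, what = "raw", n = 16L)

# Reading the UTF-8 bytes as latin1 and converting to UTF-8 again gives
# the doubled sequence c3 83 c2 a9 reported above.
doubled <- charToRaw(iconv(rawToChar(as.raw(c(0xc3, 0xa9))),
                           from = "latin1", to = "UTF-8"))
```

Comparing `written` with `charToRaw(tmp)` (plus the trailing newline byte) is a quick way to check whether a conversion happened on the way to the file.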
2010 Dec 07
3
More elegant magnitude method
I need to find the order of magnitude of a number to get a scaling parameter as a
power of 10. I have a function that works *so far*, but it is ugly and
probably buggy. In the interest of avoiding code-based outliers in my
data, I thought I would ask if anyone here has a better way.
> scl <- function(x){
+ length(charToRaw(format(trunc(x), scientific = F)))-1}
> a <- 123456789
> b <-
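A common alternative to counting digits is to take the base-10 logarithm. This is a sketch rather than a drop-in replacement: the name `magnitude` is hypothetical, the input is assumed to be finite and nonzero, and for values between -1 and 1 it returns negative exponents where the digit-counting version returns 0.

```r
# Hypothetical helper: power of 10 of the leading digit.
# Assumes x is finite and nonzero (log10(0) is -Inf).
magnitude <- function(x) {
  floor(log10(abs(x)))
}

magnitude(123456789)
magnitude(c(5, 42, 0.02))
```

Because `log10()` and `floor()` are vectorized, the function works on whole vectors without a loop.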
2009 Aug 29
2
RFE: vectorize URLdecode
In R 2.9.2,
> URLdecode(c("a%20b", "b%20c"))
[1] "a b"
Warning message:
In charToRaw(URL) : argument should be a character vector of length 1
all but the first element will be ignored
Could URLdecode be modified to actually process all elements of the vector, not
just the first?
Thanks in advance
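Until the function itself handles vectors, one workaround is to apply it element by element. This is a minimal sketch; the wrapper name `url_decode_all` is hypothetical, and newer R versions may already process every element.

```r
# Hypothetical wrapper: decode each element of a character vector in turn.
url_decode_all <- function(urls) {
  vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE)
}

url_decode_all(c("a%20b", "b%20c"))
```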
2011 Jul 21
2
User input(unknown name and number of files)
Dear all,
I need your help as I was not able to find out the solution.
The issue is this:
I have code that reads a file as follows:
df=read.table("Case2.pileup",fill=T,sep="\t",colClasses="character")
However, I am building a tool so that users can run the analysis on their own files. The file name will not be Case2.pileup, and I want to use this
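One way to avoid a hard-coded file name is to pass the path as an argument. The sketch below is an illustration under assumptions: the function names `read_pileup` and `read_all_pileups` and the script name in the comment are hypothetical, and the `read.table()` options are copied from the call above.

```r
# Hypothetical wrapper: the file name is an argument, not a constant.
read_pileup <- function(path) {
  stopifnot(file.exists(path))
  read.table(path, fill = TRUE, sep = "\t", colClasses = "character")
}

# Read any number of files; returns one data frame per path.
read_all_pileups <- function(paths) {
  lapply(paths, read_pileup)
}

# Possible non-interactive use (script name is illustrative):
#   Rscript tool.R first.pileup second.pileup
#   results <- read_all_pileups(commandArgs(trailingOnly = TRUE))
# Possible interactive use:
#   results <- read_all_pileups(file.choose())
```

`commandArgs(trailingOnly = TRUE)` returns only the arguments given after the script name, so the number of files does not need to be known in advance.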
2017 Sep 14
2
special latin1 do not print as glyphs in current devel on windows
This is a follow-up on my initial posts regarding character encodings on
Windows (https://stat.ethz.ch/pipermail/r-devel/2017-August/074728.html)
and Patrick Perry's reply
(https://stat.ethz.ch/pipermail/r-devel/2017-August/074830.html) in
particular (thank you for the links and the bug report!). My initial
posts were quite chaotic (and partly wrong), so I am trying to clear
things up a
2018 Jul 29
2
odd behavior of names
Bugzilla issue 16101 describes another first-list-name-printed-differently
oddity with the Windows GUI version of R:
> a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n"
> Encoding(a) # expect "UTF-8"
[1] "UTF-8"
> sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1,1,2)]
$`One is ????`
[1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0
[13] b8