thr3ads.net - similar to: "Consistency of serialize(): please enlighten me"

Displaying 20 results from an estimated 1000 matches similar to: "Consistency of serialize(): please enlighten me"

2008 May 21

rawToChar(raw(0))

Hi, right now we have (on R v2.7.0 patched (2008-04-23 r45466)) that: > rawToChar(raw(0)) [1] "" > rawToChar(raw(0), multiple=TRUE) character(0) Is this intended or should both return character(0)? Personally, I would prefer that an empty input vector returns an empty output vector. Same should then apply to charToRaw(), but right now we get: > x <- character(0) >

R 2.7.0, match() and strings containing \0 - bug?

2008 Apr 28

Hi, A piece of my code that uses readBin() to read a certain file type is behaving strangely with R 2.7.0. This seems to be because of a failure to match() strings after using rawToChar() when the original was terminated with a "\0" character. Direct equality testing with == still works as expected. I can reproduce this as follows: > x <- "foo" > y <-

subRaw?

2012 Jul 20

Hello, All: Do you know of any capability to substitute more than one byte in an object of class Raw? Consider the following: > let4 <- paste(letters[1:4], collapse='') > (let4Raw <- charToRaw(let4)) [1] 61 62 63 64 > (let. <- sub('bc', '--', let4Raw)) [1] "61" "62" "63" "64" > # no
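One possible approach, sketched under the assumption that the raw vector holds text without embedded nul bytes: convert to a string, substitute, and convert back. The helper name `sub_raw` is hypothetical and not part of base R.

```r
# Hypothetical helper (not a base R function): replace the first occurrence
# of a byte sequence in a raw vector.
# Assumption: x contains no embedded nul bytes, since rawToChar() cannot
# represent them in a character string.
sub_raw <- function(pattern, replacement, x) {
  txt <- rawToChar(x)
  # fixed = TRUE treats the pattern literally; useBytes = TRUE compares byte-wise
  charToRaw(sub(pattern, replacement, txt, fixed = TRUE, useBytes = TRUE))
}

let4Raw <- charToRaw(paste(letters[1:4], collapse = ""))
res <- sub_raw("bc", "--", let4Raw)
print(res)
```

Swapping `sub()` for `gsub()` inside the helper would replace every occurrence rather than only the first.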

writeLines argument useBytes = TRUE still making conversions

2018 Feb 17

Of course, right after writing this e-mail I tested on my Windows machine and did not see what I expected: > charToRaw(before) [1] c3 a9 > charToRaw(after) [1] e9 so obviously I'm misunderstanding something as well. Best, Kevin On Sat, Feb 17, 2018 at 2:19 PM, Kevin Ushey <kevinushey at gmail.com> wrote: > From my understanding, translation is implied in this line of ?file

getting corrupted data when using readBin() after seek() on a gzfile connection

2013 May 08

Hi, I'm running into more issues when reading data from a gzfile connection. If I read the data sequentially with successive calls to readBin(), the data I get looks ok. But if I call seek() between the successive calls to readBin(), I get corrupted data. Here is a (hopefully) reproducible example. See my sessionInfo() at the end (I'm not on Windows, where, according to the man page,

In C, a fast way to slice a vector?

2009 May 10

Hello, Suppose in the following code, PROTECT(sr = R_tryEval( .... )) sr is a RAWSXP vector. I wish to return another RAWSXP starting at position 13 onwards (base=0). I could create another RAWSXP of the correct length and then memcpy the required bytes and length to this new one. However is there a more efficient method? Regards Saptarshi Guha

Character Variable in X axis scatter plot

2012 Dec 22

I am very new to R statistics. I have installed R-2.15.2, Rcmdr 1.9-2 and RStudio 0.97.237 on Debian Squeeze and also on Windows 7. I can import from an Excel file OK: .Workbook <- loadWorkbook("/media/4C90-B739/Oct13-Dec21Bsl.xls") JJData <- readWorksheet(.Workbook, "Oct13-Dec21Bsl") remove(.Workbook) I have a data frame with the following. DATEEVENT

writeLines argument useBytes = TRUE still making conversions

2018 Feb 15

On Thu, Feb 15, 2018 at 11:19 AM, Kevin Ushey <kevinushey at gmail.com> wrote: > I suspect your UTF-8 string is being stripped of its encoding before > write, and so assumed to be in the system native encoding, and then > re-encoded as UTF-8 when written to the file. You can see something > similar with: > > > tmp <- 'é' > > tmp <- iconv(tmp,

How to print UTF-8 encoded strings from a C routine to R's output?

2016 Sep 05

Dear R experts, It seems that Rprintf has to be used to print from a C routine to guarantee to write to R's output according to https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Printing. However if a string is UTF-8 encoded, non-ASCII characters (e.g., the infinity symbol http://www.fileformat.info/info/unicode/char/221e/index.htm) are misprinted. Is this an unsupported feature or is

Split String in regex while Keeping Delimiter

2023 Apr 13

Dear Emily, Using a look-behind solves the split problem in this case. (Note: Using Regex is in most/many cases the simplest solution.) str = c("leucocyten + gramnegatieve staven +++ grampositieve staven ++", "leucocyten ? grampositieve coccen +") tokens = strsplit(str, "(?<=[-+])\\s++", perl=TRUE) PROBLEM The current expression does NOT work for a different

Windows, format.POSIXct and character encodings

2013 May 01

Hi all, In what encoding does format.POSIXct return its output? It doesn't seem to be utf-8: Sys.setlocale("LC_ALL", "Japanese_Japan.932") times <- c("1970-01-01 01:00:00 UTC", "1970-02-02 22:00:00 UTC") ampm <- format(as.POSIXct(times), format = "%p") x <- gsub(">", "*", paste(ampm, collapse =

writeLines argument useBytes = TRUE still making conversions

2018 Feb 15

I think this behavior is inconsistent with the documentation: tmp <- 'é' tmp <- iconv(tmp, to = 'UTF-8') print(Encoding(tmp)) print(charToRaw(tmp)) tmpfilepath <- tempfile() writeLines(tmp, con = file(tmpfilepath, encoding = 'UTF-8'), useBytes = TRUE) [1] "UTF-8" [1] c3 a9 Raw text as hex: c3 83 c2 a9 If I switch to useBytes = FALSE, then

More elegant magnitude method

2010 Dec 07

I need to find the order of magnitude of a number to get a scaling parameter as a power of 10. I have a function that works *so far*, but it is ugly and probably buggy. In the interest of avoiding code-based outliers in my data, I thought I would ask if anyone here has a better way. > scl <- function(x){ + length(charToRaw(format(trunc(x), scientific = F)))-1} > a <- 123456789 > b <-
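A common alternative, offered here only as a sketch and not as the answer given in the thread, is to take the floor of the base-10 logarithm. The function name `magnitude` is hypothetical.

```r
# Hypothetical helper: order of magnitude as a power of 10.
# Assumption: x is a finite, non-zero number (log10(0) is -Inf).
magnitude <- function(x) {
  floor(log10(abs(x)))
}

a <- 123456789
m <- magnitude(a)   # number of integer digits minus one for |a| >= 1
print(m)
print(a / 10^m)     # value rescaled into the range [1, 10)
```

Unlike the digit-counting `scl()` above, this version returns negative values for numbers whose absolute value is below 1, which may or may not be what is wanted.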

RFE: vectorize URLdecode

2009 Aug 29

In R 2.9.2, > URLdecode(c("a%20b", "b%20c")) [1] "a b" Warning message: In charToRaw(URL) : argument should be a character vector of length 1 all but the first element will be ignored Could URLdecode be modified to actually process all elements of the vector, not just the first? Thanks in advance
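For R versions where `URLdecode()` processes only the first element, a wrapper can apply it element by element. This is a minimal sketch; the name `url_decode_all` is hypothetical.

```r
# Hypothetical wrapper: apply utils::URLdecode() to each element in turn,
# so that every element of the input vector is decoded.
url_decode_all <- function(urls) {
  vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE)
}

decoded <- url_decode_all(c("a%20b", "b%20c"))
print(decoded)
```

`vapply()` with `character(1)` guarantees a character vector of the same length as the input, including `character(0)` for empty input.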

capture.output(): Using a rawConnection() [linear] instead of textConnection() [exponential]?

2014 Feb 04

I've noticed that the processing time for the default capture.output() grows exponentially in the number of characters outputted/captured. The default settings sinks to a temporary textConnection(). When instead sinking to a rawConnection(), the processing time becomes linear. See below example and attached PNG figure [also at
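The idea of sinking to a rawConnection() can be sketched as follows. This is an illustration under stated assumptions, not the code from the post: the function name `capture_output_raw` is hypothetical, and unlike capture.output() it does not auto-print visible values, so the caller prints explicitly.

```r
# Hypothetical helper: capture printed output through a raw connection
# instead of a text connection.
capture_output_raw <- function(expr) {
  con <- rawConnection(raw(0L), open = "w")
  on.exit(close(con))              # release the connection when done
  sink(con)                        # divert standard output into the connection
  tryCatch(expr, finally = sink()) # evaluate lazily, always undo the sink
  txt <- rawToChar(rawConnectionValue(con))
  strsplit(txt, "\n", fixed = TRUE)[[1]]
}

out <- capture_output_raw(print(1:3))
print(out)
```

The bytes are collected in one buffer and split into lines only once at the end, which is where the raw connection approach differs from accumulating lines in a character vector.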

User input(unknown name and number of files)

2011 Jul 21

Dear all, I need your help, as I have not been able to find a solution. I have code that reads a file like this: df=read.table("Case2.pileup",fill=T,sep="\t",colClasses="character") but I am making a tool so that a user can run the analysis on their own file. The name of the file will not be Case2.pileup, and I want to use this

memDecompress and zlib compressed base64 encoded string

2010 Jan 14

Hi, I have zlib compressed strings (example is attached) and would like to decompress them using memDecompress ... I try this: > connection <- file("compressed.txt","r") > compressed <- readLines(connection) > memDecompress(as.raw(compressed),type="g") Error in memDecompress(as.raw(compressed), type = "g") : internal error -3 in

special latin1 do not print as glyphs in current devel on windows

2017 Sep 14

This is a follow-up on my initial posts regarding character encodings on Windows (https://stat.ethz.ch/pipermail/r-devel/2017-August/074728.html) and Patrick Perry's reply (https://stat.ethz.ch/pipermail/r-devel/2017-August/074830.html) in particular (thank you for the links and the bug report!). My initial posts were quite chaotic (and partly wrong), so I am trying to clear things up a

Consistant test for NAs in a factor when exclude = NULL?

2011 Oct 27

Dear folks, Is there a function to correctly find (and count) the NAs in a factor when exclude=NULL, regardless of whether their origin is in the original data or by subsequent assignment? In example number 1 below, where NAs are assigned by is.na()<-, testing the factor with is.na() finds the correct number of NAs. In example number 2, where the NAs are from the data, neither is.na(), ==NA,
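One way to count both kinds of NA, sketched here as an assumption and not as the answer given in the thread, is to test the character representation of the factor. The helper name `count_na_factor` is hypothetical.

```r
# Hypothetical helper: count NAs in a factor whether they are stored as
# missing codes or as an NA level (as happens with exclude = NULL).
count_na_factor <- function(f) {
  # as.character() maps both an NA code and an NA level to NA_character_
  sum(is.na(as.character(f)))
}

# Case 1: NA assigned afterwards with is.na()<-
f1 <- factor(c("a", "b", "c"))
is.na(f1) <- 2

# Case 2: NA present in the data and kept as a level
f2 <- factor(c("a", NA, "b"), exclude = NULL)

print(count_na_factor(f1))
print(count_na_factor(f2))
```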

odd behavior of names

2018 Jul 29

Bugzilla issue 16101 describes another first-list-name-printed-differently oddity with the Windows GUI version of R: > a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n" > Encoding(a) # expect "UTF-8" [1] "UTF-8" > sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1,1,2)] $`One is ????` [1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0 [13] b8
