Displaying 20 results from an estimated 1000 matches similar to: "Consistency of serialize(): please enlighten me"
2008 May 21
1
rawToChar(raw(0))
Hi,
right now we have (on R v2.7.0 patched (2008-04-23 r45466)) that:
> rawToChar(raw(0))
[1] ""
> rawToChar(raw(0), multiple=TRUE)
character(0)
Is this intended or should both return character(0)? Personally, I
would prefer that an empty input vector returns an empty output
vector. Same should then apply to charToRaw(), but right now we get:
> x <- character(0)
>
2008 Apr 28
4
R 2.7.0, match() and strings containing \0 - bug?
Hi,
A piece of my code that uses readBin() to read a certain file type is
behaving strangely with R 2.7.0. This seems to be because of a failure
to match() strings after using rawToChar() when the original was
terminated with a "\0" character. Direct equality testing with ==
still works as expected. I can reproduce this as follows:
> x <- "foo"
> y <-
2012 Jul 20
1
subRaw?
Hello, All:
Do you know of any capability to substitute more than one byte in
an object of class Raw?
Consider the following:
> let4 <- paste(letters[1:4], collapse='')
> (let4Raw <- charToRaw(let4))
[1] 61 62 63 64
> (let. <- sub('bc', '--', let4Raw))
[1] "61" "62" "63" "64"
> # no
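One possible way to substitute several bytes at once, sketched here under assumptions, is to round-trip the raw vector through a character string. The helper name sub_raw is hypothetical (it is not a base R function), and the approach assumes the raw vector contains no 0x00 bytes, which rawToChar() rejects.

```r
# Hypothetical helper, not part of base R: replace the first occurrence of a
# multi-byte pattern in a raw vector by converting to a string and back.
# Assumption: x contains no 0x00 bytes (rawToChar() would fail on them).
sub_raw <- function(x, pattern, replacement) {
  txt <- rawToChar(x)
  # fixed = TRUE treats the pattern literally; useBytes = TRUE matches byte-wise
  out <- sub(pattern, replacement, txt, fixed = TRUE, useBytes = TRUE)
  charToRaw(out)
}

let4Raw <- charToRaw("abcd")
res <- sub_raw(let4Raw, "bc", "--")
print(res)
```

Calling sub() directly on the raw vector coerces each byte to its hex string ("61", "62", ...), which is why the pattern 'bc' never matches in the attempt above.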
2018 Feb 17
1
writeLines argument useBytes = TRUE still making conversions
Of course, right after writing this e-mail I tested on my Windows
machine and did not see what I expected:
> charToRaw(before)
[1] c3 a9
> charToRaw(after)
[1] e9
so obviously I'm misunderstanding something as well.
Best,
Kevin
On Sat, Feb 17, 2018 at 2:19 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> From my understanding, translation is implied in this line of ?file
2013 May 08
1
getting corrupted data when using readBin() after seek() on a gzfile connection
Hi,
I'm running into more issues when reading data from a gzfile connection.
If I read the data sequentially with successive calls to readBin(), the
data I get looks ok. But if I call seek() between the successive calls
to readBin(), I get corrupted data.
Here is a (hopefully) reproducible example. See my sessionInfo() at the
end (I'm not on Windows, where, according to the man page,
2009 May 10
2
In C, a fast way to slice a vector?
Hello,
Suppose in the following code,
PROTECT(sr = R_tryEval( .... ))
sr is a RAWSXP vector. I wish to return another RAWSXP starting at
position 13 onwards (base=0).
I could create another RAWSXP of the correct length and then memcpy
the required bytes and length to this new one.
However, is there a more efficient method?
Regards
Saptarshi Guha
2012 Dec 22
1
Character Variable in X axis scatter plot
I am very new to R statistics.
I have installed R-2.15.2, Rcmdr 1.9-2, and RStudio 0.97.237 on Debian Squeeze and also on Windows 7.
I can import from an Excel file OK:
.Workbook <- loadWorkbook("/media/4C90-B739/Oct13-Dec21Bsl.xls")
JJData <- readWorksheet(.Workbook, "Oct13-Dec21Bsl")
remove(.Workbook)
I have a data frame with the following:
DATEEVENT
2018 Feb 15
2
writeLines argument useBytes = TRUE still making conversions
On Thu, Feb 15, 2018 at 11:19 AM, Kevin Ushey <kevinushey at gmail.com> wrote:
> I suspect your UTF-8 string is being stripped of its encoding before
> write, and so assumed to be in the system native encoding, and then
> re-encoded as UTF-8 when written to the file. You can see something
> similar with:
>
> > tmp <- 'é'
> > tmp <- iconv(tmp,
2016 Sep 05
2
How to print UTF-8 encoded strings from a C routine to R's output?
Dear R experts,
It seems that Rprintf has to be used to print from a C routine to guarantee
to write to R's output according to
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Printing.
However if a string is UTF-8 encoded, non-ASCII characters (e.g., the
infinity symbol http://www.fileformat.info/info/unicode/char/221e/index.htm)
are misprinted.
Is this an unsupported feature or is
2023 Apr 13
1
Split String in regex while Keeping Delimiter
Dear Emily,
Using a look-behind solves the split problem in this case. (Note: Using
Regex is in most/many cases the simplest solution.)
str = c("leucocyten + gramnegatieve staven +++ grampositieve staven ++",
"leucocyten ? grampositieve coccen +")
tokens = strsplit(str, "(?<=[-+])\\s++", perl=TRUE)
PROBLEM
The current expression does NOT work for a different
2013 May 01
1
Windows, format.POSIXct and character encodings
Hi all,
In what encoding does format.POSIXct return its output? It doesn't
seem to be utf-8:
Sys.setlocale("LC_ALL", "Japanese_Japan.932")
times <- c("1970-01-01 01:00:00 UTC", "1970-02-02 22:00:00 UTC")
ampm <- format(as.POSIXct(times), format = "%p")
x <- gsub(">", "*", paste(ampm, collapse =
2018 Feb 15
2
writeLines argument useBytes = TRUE still making conversions
I think this behavior is inconsistent with the documentation:
tmp <- 'é'
tmp <- iconv(tmp, to = 'UTF-8')
print(Encoding(tmp))
print(charToRaw(tmp))
tmpfilepath <- tempfile()
writeLines(tmp, con = file(tmpfilepath, encoding = 'UTF-8'), useBytes = TRUE)
[1] "UTF-8"
[1] c3 a9
Raw text as hex: c3 83 c2 a9
If I switch to useBytes = FALSE, then
2010 Dec 07
3
More elegant magnitude method
I need to find the order of magnitude of a number to get a scaling parameter as a
power of 10. I have a function that works *so far*, but it is ugly and
probably buggy. In the interest of avoiding code-based outliers in my
data, I thought I would ask if anyone here has a better way.
> scl <- function(x){
+ length(charToRaw(format(trunc(x), scientific = F)))-1}
> a <- 123456789
> b <-
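A common alternative to counting formatted digits, offered here only as a sketch, is to take the floor of the base-10 logarithm. The function name magnitude is hypothetical, and the sketch assumes finite, non-zero input; values extremely close to a power of ten may be affected by floating-point rounding.

```r
# Hypothetical helper: order of magnitude as a power of 10.
# Assumption: x is finite and non-zero (log10(0) is -Inf).
magnitude <- function(x) floor(log10(abs(x)))

a <- 123456789
m_a <- magnitude(a)                  # same value as scl(a) in the post above
m_vec <- magnitude(c(0.5, 1234, -42))
print(m_a)
print(m_vec)
```

Unlike the digit-counting version, this one is vectorized, ignores the sign, and returns negative exponents for values between 0 and 1 in absolute value.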
2009 Aug 29
2
RFE: vectorize URLdecode
In R 2.9.2,
> URLdecode(c("a%20b", "b%20c"))
[1] "a b"
Warning message:
In charToRaw(URL) : argument should be a character vector of length 1
all but the first element will be ignored
Could URLdecode be modified to actually process all elements of the vector, not
just the first?
Thanks in advance
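Until URLdecode() itself handles vectors, one workaround is to apply it element by element. This is a minimal sketch; the wrapper name url_decode_all is hypothetical.

```r
# Hypothetical wrapper: decode every element of a character vector by
# calling utils::URLdecode() once per element.
url_decode_all <- function(urls) {
  vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE)
}

decoded <- url_decode_all(c("a%20b", "b%20c"))
print(decoded)
```

Using vapply() with character(1) guarantees a character vector of the same length as the input, including character(0) for empty input.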
2014 Feb 04
0
capture.output(): Using a rawConnection() [linear] instead of textConnection() [exponential]?
I've noticed that the processing time for the default capture.output()
grows exponentially in the number of characters outputted/captured.
The default settings sinks to a temporary textConnection(). When
instead sinking to a rawConnection(), the processing time becomes
linear. See below example and attached PNG figure [also at
2011 Jul 21
2
User input(unknown name and number of files)
Dear all,
I need your help, as I was not able to find a solution.
The situation is this:
I have code that reads a file with this call:
df=read.table("Case2.pileup",fill=T,sep="\t",colClasses="character")
But I am making a tool so that a user can run the analysis on his own file. The name of the file will not be Case2.pileup and I want to use this
2010 Jan 14
1
memDecompress and zlib compressed base64 encoded string
Hi,
I have zlib compressed strings (example is attached) and would like to
decompress them using memDecompress ...
I try this:
> connection <- file("compressed.txt","r")
> compressed <- readLines(connection)
> memDecompress(as.raw(compressed),type="g")
Error in memDecompress(as.raw(compressed), type = "g") :
internal error -3 in
2017 Sep 14
2
special latin1 do not print as glyphs in current devel on windows
This is a follow-up on my initial posts regarding character encodings on
Windows (https://stat.ethz.ch/pipermail/r-devel/2017-August/074728.html)
and Patrick Perry's reply
(https://stat.ethz.ch/pipermail/r-devel/2017-August/074830.html) in
particular (thank you for the links and the bug report!). My initial
posts were quite chaotic (and partly wrong), so I am trying to clear
things up a
2011 Oct 27
2
Consistant test for NAs in a factor when exclude = NULL?
Dear folks,
Is there a function to correctly find (and count) the NAs in a factor when
exclude=NULL, regardless of whether their origin is in the original data or
by subsequent assignment?
In example number 1 below, where NAs are assigned by is.na()<-, testing the
factor with is.na() finds the correct number of NAs. In example number 2,
where the NAs are from the data, neither is.na(), ==NA,
2018 Jul 29
2
odd behavior of names
Bugzilla issue 16101 describes another first-list-name-printed-differently
oddity
with the Windows GUI version of R:
> a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n"
> Encoding(a) # expect "UTF-8"
[1] "UTF-8"
> sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1,1,2)]
$`One is ????`
[1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0
[13] b8