Displaying 17 results from an estimated 17 matches for "utf8toint".
2008 Jan 02
1
Fwd: Re: Another problem with encoding
...1
>
>> names(attributes(spss[,'Y6'])[[1]][14])
>>
> [1] "?rzte Chirurgie"
>
>> iconv(names(attributes(spss[,'Y6'])[[1]][14]), "UTF-8", "LATIN1")
>>
> [1] NA
>
>> utf8ToInt(names(attributes(spss[,'Y6'])[[1]][14]))
>>
> Error in utf8ToInt(names(attributes(spss[, "Y6"])[[1]][14])) :
> invalid UTF-8 string
>
>
> Cheers,
> Matthias
>
>
> -----Original Message-----
> From: Peter Dalgaard [mailto:p.dalg...
2007 Dec 30
1
Another problem with encoding
Hi
I've imported an SPSS file using read.spss. One variable has values like 'Ärzte'. I thought this was UTF-8 encoded, but it is not
(as the results of iconv and utf8ToInt suggest). Is there any way to find out how these SPSS values are encoded?
Regards,
Matthias
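One way to approach the question above, as a minimal sketch rather than a definitive recipe: look at the raw bytes, check whether they form valid UTF-8, and then try a candidate source encoding. The byte string below is a made-up stand-in for the SPSS label, and Latin-1 is only an assumed guess at its encoding. validUTF8() requires R 3.3.0 or later.

```r
# Hypothetical stand-in for the imported label: "Ärzte" stored as Latin-1 bytes.
x <- rawToChar(as.raw(c(0xC4, 0x72, 0x7A, 0x74, 0x65)))

# Inspect the bytes directly; this works whatever the encoding is.
bytes <- charToRaw(x)

# 0xC4 would start a two-byte UTF-8 sequence, but 0x72 is not a
# continuation byte, so the string is not valid UTF-8.
is_utf8 <- validUTF8(x)

# Assumption: the source is Latin-1. Re-encode to UTF-8 from that guess.
y <- iconv(x, from = "latin1", to = "UTF-8")

# Code points of the converted string; the first one is U+00C4.
cp <- utf8ToInt(y)
```

If the converted string reads correctly, the guess was plausible; if not, other candidate encodings can be tried the same way.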
2014 Apr 15
1
ASCIIfy() - a proposal for package:tools
...- paste0("\\x", raw)
else if(length(raw)==1 && bytes==2) # 8-bit to \u0000
ascii <- paste0("\\u", chartr(" ","0",formatC(as.character(raw),width=4)))
else if(length(raw)==2 && bytes==1) # 16-bit to \x00, if possible
if(utf8ToInt(char) <= 255)
ascii <- paste0("\\x", format.hexmode(utf8ToInt(char)))
else {
ascii <- fallback; warning(char, " could not be converted to 1 byte")}
else if(length(raw)==2 && bytes==2) # UTF-8 to \u0000
ascii <- paste0("\\...
2012 Dec 11
2
Writing escaped unicode
I'd like to write unicode strings using the "\u" escape syntax. According to the documentation, print.default or encodeString will escape unicode using the \u convention. In practice, I can't make it work.
> b="Unicode character: \ufffd"
> print.default(b)
[1] "Unicode character: ?"
> encodeString(b)
[1] "Unicode character: ?"
I want to
2017 Nov 30
2
R 3.4.3 is released
...maximum of 0x10FFFF. (This aligns with
the current RFC3629.)
* Fix calling of methods on S4 generics that dispatch on ... when
the call contains ....
* Following Unicode 'Corrigendum 9', the UTF-8 representations of
U+FFFE and U+FFFF are now regarded as valid by utf8ToInt().
* range(c(TRUE, NA), finite = TRUE) and similar no longer return
NA. (Reported by Lukas Stadler.)
* The self starting function attr(SSlogis, "initial") now also
works when the y values have exact minimum zero and is slightly
changed in general, behaving symm...
2014 Dec 04
4
\U with more than 4 digits returns the wrong character
If I type a character using \U syntax that has more than 4 digits, I
get the wrong character. For example,
"\U1d4d0"
should print a mathematical bold script capital A. See
http://www.fileformat.info/info/unicode/char/1d4d0/index.htm
On my machine, it prints the Hangul character corresponding to
"\Ud4d0"
http://www.fileformat.info/info/unicode/char/d4d0/index.htm
It seems
2017 Dec 01
0
R 3.4.3 is released
...> the current RFC3629.)
>
> * Fix calling of methods on S4 generics that dispatch on ... when
> the call contains ....
>
> * Following Unicode 'Corrigendum 9', the UTF-8 representations of
> U+FFFE and U+FFFF are now regarded as valid by utf8ToInt().
>
> * range(c(TRUE, NA), finite = TRUE) and similar no longer return
> NA. (Reported by Lukas Stadler.)
>
> * The self starting function attr(SSlogis, "initial") now also
> works when the y values have exact minimum zero and is slightly
>...
2010 May 31
1
R 2.11.1 is released
...o longer reads past 'len' bytes
(unlikely to be a problem except in user code). (PR#14246)
o On systems without any default LD_LIBRARY_PATH (not even
/usr/local/lib), [DY]LIB_LIBRARY_PATH is now set without a
trailing colon. (PR#13637)
o More efficient utf8ToInt() on long multi-byte strings with
many multi-byte characters. (PR#14262)
o aggregate.ts() gave platform-dependent results due to rounding
error for ndeltat != 1.
o package.skeleton() sometimes failed to fix filenames for .R or
.Rd files to start with an alphanu...
2013 Apr 11
2
(no subject)
Dear all,
Is there a quick and easy way of converting utf characters to the \uxxxx
form (necessary e.g. for packages)? I mean something working like this:
> utf2uxxxx("õäöü")
[1] "\u00f5\u00e4\u00f6\u00fc"
It is easy to program but perhaps someone already has implemented this. (I
couldn't find anything useful from searches incl RSiteSearch).
Thanks in advance,
Kenn
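A minimal sketch of such a converter, built on utf8ToInt(). The name utf2uxxxx is taken from the question and is hypothetical, not an existing function. The sketch assumes non-NA input and uses the eight-digit \U form for code points above U+FFFF.

```r
# Hypothetical helper: replace every non-ASCII character with a \uXXXX escape.
utf2uxxxx <- function(x) {
  x <- enc2utf8(x)
  vapply(x, function(s) {
    cp <- utf8ToInt(s)                        # code points of one string
    chars <- intToUtf8(cp, multiple = TRUE)   # one element per code point
    # Four hex digits fit the BMP; larger code points need \U and eight digits.
    fmt <- ifelse(cp > 0xFFFF, "\\U%08x", "\\u%04x")
    out <- ifelse(cp < 128L, chars, sprintf(fmt, cp))
    paste(out, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# Input written with escapes so the example does not depend on file encoding.
res <- utf2uxxxx("\u00f5\u00e4\u00f6\u00fc")
cat(res, "\n")
```

ASCII characters pass through unchanged, so mixed strings stay readable.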
2012 Nov 29
3
what's this character?
Hi list,
I've encountered this problem (see below). I know it's particularly R-related and it's easy to get by but it still bothers me a lot.
To me, the last character of "N.C. " looks like a space, but it's clearly not. Can someone tell me a way to figure out what character is in the last position?
Thanks!
Tao
> levels(dat$flag)[3]
[1] "N.C.?"
>
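One way to identify such a character, as a sketch: extract the last character and look at its code point and bytes. The string below is a made-up stand-in for levels(dat$flag)[3]. It assumes the trailing character is a no-break space (U+00A0), which is a common culprit but not confirmed by the post.

```r
# Hypothetical stand-in: "N.C." followed by U+00A0 (assumed, not from the post).
s <- "N.C.\u00a0"

# Take the last character of the string.
last <- substring(s, nchar(s))

# Its Unicode code point and its UTF-8 bytes.
code <- utf8ToInt(enc2utf8(last))
bytes <- charToRaw(enc2utf8(last))

# Printed in U+XXXX notation for lookup in a character table.
label <- sprintf("U+%04X", code)
```

An ordinary space would give code point 32, so any other value shows what the character really is.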
2010 Apr 22
2
R2.11.0 - rasterImage() and barplot fill-patterns
...'curl' but not 'wget', but also for some
> hard-to-access URLs.
>
> o In Rd, \eqn and \deqn will render in HTML (and convert to text)
> upper- and lower-case Greek letters (entered as \alpha ...),
> \ldots, \dots, \ge and \le.
>
> o utf8ToInt() and intToUtf8() now map NA inputs to NA outputs.
>
> o file() has a new argument 'raw' which may help if it is used
> with something other than a regular file, e.g. a character device.
>
> o New function strtoi(), a wrapper for the C function strtol.
>
>...
2010 Apr 22
0
R 2.11.0 is released
...platforms which have 'curl' but not 'wget', but also for some
hard-to-access URLs.
o In Rd, \eqn and \deqn will render in HTML (and convert to text)
upper- and lower-case Greek letters (entered as \alpha ...),
\ldots, \dots, \ge and \le.
o utf8ToInt() and intToUtf8() now map NA inputs to NA outputs.
o file() has a new argument 'raw' which may help if it is used
with something other than a regular file, e.g. a character device.
o New function strtoi(), a wrapper for the C function strtol.
o as.octmode() and...
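A short sketch of two of the items above, assuming R 2.11.0 or later: the NA handling of utf8ToInt() and intToUtf8(), and strtoi() with an explicit base.

```r
# NA inputs map to NA outputs in both directions.
na_int <- utf8ToInt(NA_character_)
na_chr <- intToUtf8(NA_integer_)

# strtoi() parses a string as an integer in the given base.
hex_val <- strtoi("ff", base = 16L)   # hexadecimal
oct_val <- strtoi("777", base = 8L)   # octal
```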
2005 Oct 06
0
R-2.2.0 is released
...parisons
(based on a patch contributed by Fernando Henrique Ferraz P. da Rosa).
o New functions URLencode() and URLdecode(), particularly for use
with file:// URLs. These are used by e.g. browse.env(),
download.file(), download.packages() and various help() print
methods.
o Functions utf8ToInt() and intToUtf8() to work with UTF-8
encoded character strings (irrespective of locale or OS-level
UTF-8 support).
o [dqp]wilcox and wilcox.test work better with one very large sample
size and an extreme first argument.
o write() has a new argument 'sep'.
o write.csv[2] no...
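A brief illustration of the functions named above, as a sketch: a round trip through utf8ToInt() and intToUtf8(), and URLencode() with URLdecode() on a string containing a space. The sample strings are made up for the example.

```r
# "naïve", written with an escape so the example does not depend on file encoding.
s <- "na\u00efve"

cp <- utf8ToInt(s)                        # integer code points
back <- intToUtf8(cp)                     # a single string again
chars <- intToUtf8(cp, multiple = TRUE)   # one element per code point

# Percent-encode a space and decode it again.
enc <- utils::URLencode("a b")
dec <- utils::URLdecode(enc)
```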