thr3ads.net - search: "utf8toint"

2008 Jan 02

1

WG: AW: Another problem with encoding

...1 > >> names(attributes(spss[,'Y6'])[[1]][14]) >> > [1] "?rzte Chirurgie" > >> iconv(names(attributes(spss[,'Y6'])[[1]][14]), "UTF-8", "LATIN1") >> > [1] NA > >> utf8ToInt(names(attributes(spss[,'Y6'])[[1]][14])) >> > Error in utf8ToInt(names(attributes(spss[, "Y6"])[[1]][14])) : > invalid UTF-8 string > > > Cheers, > Matthias > > > -----Original Message----- > From: Peter Dalgaard [mailto:p.dalg...

Another problem with encoding

2007 Dec 30

1

Another problem with encoding

Hi, I've imported an SPSS file using read.spss. One variable has values like '?rzte'. I thought these were UTF-8 encoded, but they are not (as the results of iconv and utf8ToInt suggest). Is there any way to find out how these SPSS values are encoded? Regards, Matthias
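A minimal sketch of how one might probe such a string, assuming (hypothetically) that the label is "Ärzte" stored as Latin-1 bytes; the byte values below are an illustration, not data from the original post. It inspects the raw bytes, tests UTF-8 validity with validUTF8() (available in R 3.3.0 and later), and converts with iconv() under the guessed source encoding.

```r
# Hypothetical stand-in for the imported label: "Ärzte" as Latin-1 bytes.
x <- rawToChar(as.raw(c(0xC4, 0x72, 0x7A, 0x74, 0x65)))

# Look at the bytes and test whether they form valid UTF-8.
bytes <- charToRaw(x)      # c4 72 7a 74 65
is_utf8 <- validUTF8(x)    # FALSE: 0xC4 must be followed by a continuation byte

# Assumption: the source encoding is Latin-1. Convert it to UTF-8.
y <- iconv(x, from = "latin1", to = "UTF-8")

# After conversion the string is valid UTF-8 and utf8ToInt() accepts it.
codepoints <- utf8ToInt(y) # 196 114 122 116 101
```

If the converted string shows the expected umlaut, the Latin-1 guess was plausible; otherwise another `from` encoding from iconvlist() can be tried.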

ASCIIfy() - a proposal for package:tools

2014 Apr 15

1

ASCIIfy() - a proposal for package:tools

...- paste0("\\x", raw) else if(length(raw)==1 && bytes==2) # 8-bit to \u0000 ascii <- paste0("\\u", chartr(" ","0",formatC(as.character(raw),width=4))) else if(length(raw)==2 && bytes==1) # 16-bit to \x00, if possible if(utf8ToInt(char) <= 255) ascii <- paste0("\\x", format.hexmode(utf8ToInt(char))) else { ascii <- fallback; warning(char, " could not be converted to 1 byte")} else if(length(raw)==2 && bytes==2) # UTF-8 to \u0000 ascii <- paste0("\\...

Writing escaped unicode

2012 Dec 11

2

Writing escaped unicode

I'd like to write unicode strings using the "\u" escape syntax. According to the documentation, print.default or encodeString will escape unicode using the \u convention. In practice, I can't make it work. > b="Unicode character: \ufffd" > print.default(b) [1] "Unicode character: ?" > encodeString(b) [1] "Unicode character: ?" I want to

R 3.4.3 is released

2017 Nov 30

2

R 3.4.3 is released

...maximum of 0x10FFFF. (This aligns with the current RFC3629.) * Fix calling of methods on S4 generics that dispatch on ... when the call contains .... * Following Unicode 'Corrigendum 9', the UTF-8 representations of U+FFFE and U+FFFF are now regarded as valid by utf8ToInt(). * range(c(TRUE, NA), finite = TRUE) and similar no longer return NA. (Reported by Lukas Stadler.) * The self starting function attr(SSlogis, "initial") now also works when the y values have exact minimum zero and is slightly changed in general, behaving symm...

\U with more than 4 digits returns the wrong character

2014 Dec 04

4

\U with more than 4 digits returns the wrong character

If I type a character using \U syntax that has more than 4 digits, I get the wrong character. For example, "\U1d4d0" should print a mathematical bold script capital A. See http://www.fileformat.info/info/unicode/char/1d4d0/index.htm On my machine, it prints the Hangul character corresponding to "\Ud4d0" http://www.fileformat.info/info/unicode/char/d4d0/index.htm It seems
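One way to check whether a five-digit \U escape was parsed as intended is to compare its code point with the expected value. This is only a sketch; the variable names are illustrative, and the result on the original poster's machine may have differed from what a current R build gives.

```r
# Parse the escape and recover its code point.
s <- "\U1d4d0"
cp <- utf8ToInt(s)

# TRUE when the escape was read as U+1D4D0 rather than truncated to U+D4D0.
parsed_ok <- identical(cp, 0x1D4D0L)

# Building the character from its code point avoids the escape parser entirely.
s2 <- intToUtf8(0x1D4D0L)
label <- sprintf("U+%04X", utf8ToInt(s2))
```

Comparing `cp` against `0xD4D0L` instead would confirm the truncation described in the post.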

R 3.4.3 is released

2017 Dec 01

0

R 3.4.3 is released

...> the current RFC3629.) > > * Fix calling of methods on S4 generics that dispatch on ... when > the call contains .... > > * Following Unicode 'Corrigendum 9', the UTF-8 representations of > U+FFFE and U+FFFF are now regarded as valid by utf8ToInt(). > > * range(c(TRUE, NA), finite = TRUE) and similar no longer return > NA. (Reported by Lukas Stadler.) > > * The self starting function attr(SSlogis, "initial") now also > works when the y values have exact minimum zero and is slightly >...

R 2.11.1 is released

2010 May 31

1

R 2.11.1 is released

...o longer reads past 'len' bytes (unlikely to be a problem except in user code). (PR#14246) o On systems without any default LD_LIBRARY_PATH (not even /usr/local/lib), [DY]LIB_LIBRARY_PATH is now set without a trailing colon. (PR#13637) o More efficient utf8ToInt() on long multi-byte strings with many multi-byte characters. (PR#14262) o aggregate.ts() gave platform-dependent results due to rounding error for ndeltat != 1. o package.skeleton() sometimes failed to fix filenames for .R or .Rd files to start with an alphanu...

(no subject)

2013 Apr 11

2

(no subject)

Dear all, Is there a quick and easy way of converting utf characters to the \uxxxx form (necessary e.g. for packages)? I mean something working like this: > utf2uxxxx("õäöü") [1] "\u00f5\u00e4\u00f6\u00fc" It is easy to program but perhaps someone already has implemented this. (I couldn't find anything useful from searches incl RSiteSearch). Thanks in advance, Kenn
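A possible sketch of the requested helper, built on utf8ToInt() and sprintf(). The name utf2uxxxx comes from the question; the body is an assumption about the desired behaviour (ASCII kept as-is, other characters escaped), not an existing function from any package.

```r
# Sketch: escape every non-ASCII character of a single string as \uxxxx
# (or \Uxxxxxxxx for code points above U+FFFF). ASCII is left unchanged.
utf2uxxxx <- function(x) {
  stopifnot(is.character(x), length(x) == 1L, !is.na(x))
  cps <- utf8ToInt(enc2utf8(x))
  esc <- ifelse(
    cps < 128L,
    intToUtf8(cps, multiple = TRUE),
    ifelse(cps <= 0xFFFF, sprintf("\\u%04x", cps), sprintf("\\U%08x", cps))
  )
  paste(esc, collapse = "")
}

# The input is written with escapes so the example does not depend on the
# encoding of the source file.
out <- utf2uxxxx("\u00f5\u00e4\u00f6\u00fc")
cat(out, "\n")   # \u00f5\u00e4\u00f6\u00fc
```

Note that a literal backslash in the input is passed through unchanged, so the output is only unambiguous for input without backslashes.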

what's this character?

2012 Nov 29

3

what's this character?

Hi list, I've encountered this problem (see below). I know it's particularly R-related and it's easy to get by but it still bothers me a lot. It looks like the last character of "N.C. " is a space to me, but it's clearly not. Can someone tell me a way to figure out what character is in the last position. Thanks! Tao > levels(dat$flag)[3] [1] "N.C.?" >
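A short sketch of how the trailing character could be identified with utf8ToInt(). The string below assumes, purely for illustration, that the mystery character is a no-break space (U+00A0); the original post does not say what it actually was.

```r
# Hypothetical value: "N.C." followed by a no-break space (an assumption).
flag <- "N.C.\u00a0"

# Convert to code points and look at the last one.
cps <- utf8ToInt(flag)
last_cp <- cps[length(cps)]               # 160
last_label <- sprintf("U+%04X", last_cp)  # "U+00A0"

# Replace the no-break space with an ordinary space (code point 32).
cleaned <- intToUtf8(ifelse(cps == 160L, 32L, cps))
```

An ordinary space would report 32 here, so any other value points to a look-alike character.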

R2.11.0 - rasterImage() and barplot fill-patterns

2010 Apr 22

2

R2.11.0 - rasterImage() and barplot fill-patterns

...'curl' but not 'wget', but also for some > hard-to-access URLs. > > o In Rd, \eqn and \deqn will render in HTML (and convert to text) > upper- and lower-case Greek letters (entered as \alpha ...), > \ldots, \dots, \ge and \le. > > o utf8ToInt() and intToUtf8() now map NA inputs to NA outputs. > > o file() has a new argument 'raw' which may help if it is used > with something other than a regular file, e.g. a character device. > > o New function strtoi(), a wrapper for the C function strtol. > ...

R 2.11.0 is released

2010 Apr 22

0

R 2.11.0 is released

...platforms which have 'curl' but not 'wget', but also for some hard-to-access URLs. o In Rd, \eqn and \deqn will render in HTML (and convert to text) upper- and lower-case Greek letters (entered as \alpha ...), \ldots, \dots, \ge and \le. o utf8ToInt() and intToUtf8() now map NA inputs to NA outputs. o file() has a new argument 'raw' which may help if it is used with something other than a regular file, e.g. a character device. o New function strtoi(), a wrapper for the C function strtol. o as.octmode() and...
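The NA handling mentioned in this NEWS entry can be illustrated with a small sketch; the variable names are illustrative only.

```r
# NA in, NA out for both directions.
na_int <- utf8ToInt(NA_character_)
na_chr <- intToUtf8(NA_integer_)

both_na <- is.na(na_int) && is.na(na_chr)

# With multiple = TRUE each code point is converted separately.
parts <- intToUtf8(c(72L, NA, 105L), multiple = TRUE)
```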

R-2.2.0 is released

2005 Oct 06

0

R-2.2.0 is released

...parisons (based on a patch contributed by Fernando Henrique Ferraz P. da Rosa). o New functions URLencode() and URLdecode(), particularly for use with file:// URLs. These are used by e.g. browse.env(), download.file(), download.packages() and various help() print methods. o Functions utf8ToInt() and intToUtf8() to work with UTF-8 encoded character strings (irrespective of locale or OS-level UTF-8 support). o [dqp]wilcox and wilcox.test work better with one very large sample size and an extreme first argument. o write() has a new argument 'sep'. o write.csv[2] no...
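As a brief illustration of the two functions announced here, the following sketch round-trips a string through its Unicode code points. The example string is arbitrary.

```r
# Greek alpha and beta, written as escapes to stay independent of file encoding.
s <- "\u03b1\u03b2"

# String -> integer code points.
cps <- utf8ToInt(s)                       # 945 946

# Code points -> single string, and -> one string per character.
back <- intToUtf8(cps)
chars <- intToUtf8(cps, multiple = TRUE)

round_trip_ok <- identical(back, s)
```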
