2014 Jul 28
Parsing and deparsing of escaped unicode characters
...8"
nchar(x) #9, seems OK
cat(deparse(x))
"I like <U+5BFF><U+53F8>"

As a result, the code does not parse() back into the proper Unicode characters. I am currently using a regular expression to convert the output of deparse() into something that parse() (and JSON) supports:

utf8conv <- function(x) {
  gsub("<U\\+([0-9A-F]{4})>","\\\\u\\1",x)
}
> src <- utf8conv(src)
> y <- parse(text=src)[[1]]
> identical(x, y)
[1] TRUE

However, this is suboptimal because it introduces a big performance overhead for large text. Several things are un...