Yingfu Xie
2008-Aug-15 09:54 UTC
[R] How to substitute special characters within a data frame?
Hello all,

I have a data frame in R, imported from an excel file in Swedish. The original file contains several columns that have special characters, such as \¨{a}, \¨{o}, and so on. After import such special characters are represented in the data frame by "\\345", "\\366" etc (don't ask me why). For example, a word "Hårkan" becomes ''H\\345rkan".

Now my question is if it is possible to substitute those "H\\345rkan" by "Haarkan" or simply "Harkan" in R, ideally by finding those "\\345" and then replacing.

Thanks in advance,
Yingfu

[[alternative HTML version deleted]]
Prof Brian Ripley
2008-Aug-15 10:43 UTC
[R] How to substitute special characters within a data frame?
You've not told us the 'at a minimum' information requested in the posting guide. What OS? What locale? And how did you 'import'?

But here's a guess. If you change \\345 to \345, it should render correctly in a Latin-1 locale:

> "H\345rkan"
[1] "Hårkan"

If this is a UTF-8 locale, convert it

> iconv("H\345rkan", "latin1")
[1] "Hårkan"

and if you have an unsuitable locale, e.g. a Chinese one

> iconv("H\345rkan", "latin1", "ASCII//TRANSLIT")
[1] "Harkan"

or

> gsub("\\\\345", "aa", "H\\345rkan")
[1] "Haarkan"

On Fri, 15 Aug 2008, Yingfu Xie wrote:

> Hello all,
>
> I have a data frame in R, imported from an excel file in Swedish. The
> original file contains several columns that have special characters,
> such as \¨{a}, \¨{o}, and so on. After import such special characters
> are represented in the data frame by "\\345", "\\366" etc (don't ask me
> why). For example, a word "Hårkan" becomes ''H\\345rkan".

That's odd: the quotes do not match. We do need to ask you 'why', as we have nothing reproducible here.

> Now my question is if it is possible to substitute those "H\\345rkan" by
> "Haarkan" or simply "Harkan" in R, ideally by finding those "\\345" and
> then replacing.
>
> Thanks in advance,
> Yingfu
>
> [[alternative HTML version deleted]]

Please don't (as the posting guide asked). Properly encoded plain text has a chance of working.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
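The single-string gsub() call above could be extended to every text column of a data frame. What follows is only a minimal sketch, assuming the escapes are stored literally (a backslash followed by three digits) in character columns. The data frame `df`, the lookup vector `ascii_map`, and the helper `replace_escapes()` are hypothetical names invented for illustration, not anything from the original post.

```r
# Hypothetical stand-in for the imported data. In R source, "\\345" is a
# literal backslash followed by the digits 345 (four characters in total).
df <- data.frame(
  name = c("H\\345rkan", "Bj\\366rk", "Lule\\345"),
  n    = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Assumed mapping from literal escape to plain-ASCII replacement.
ascii_map <- c("\\345" = "a", "\\344" = "a", "\\366" = "o")

# Replace every escape listed in `map` within a character vector.
# fixed = TRUE treats the pattern as a plain string, so the backslash
# does not need a second level of regex escaping.
replace_escapes <- function(x, map) {
  for (code in names(map)) {
    x <- gsub(code, map[[code]], x, fixed = TRUE)
  }
  x
}

# Apply the helper to character columns only; other columns are untouched.
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], replace_escapes, map = ascii_map)
```

If the import produced factor columns rather than character columns, the same helper could be applied to `levels()` of each factor instead.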
Henrique Dallazuanna
2008-Aug-15 11:20 UTC
[R] How to substitute special characters within a data frame?
Try this:

gsub("\\\\345", "a", "H\\345rkan")

But see:

cat("H\345rkan\n")

On Fri, Aug 15, 2008 at 6:54 AM, Yingfu Xie <Yingfu.Xie at sekon.slu.se> wrote:
> Hello all,
>
> I have a data frame in R, imported from an excel file in Swedish. The original file contains several columns that have special characters, such as \¨{a}, \¨{o}, and so on. After import such special characters are represented in the data frame by "\\345", "\\366" etc (don't ask me why). For example, a word "Hårkan" becomes ''H\\345rkan".
>
> Now my question is if it is possible to substitute those "H\\345rkan" by "Haarkan" or simply "Harkan" in R, ideally by finding those "\\345" and then replacing.
>
> Thanks in advance,
> Yingfu
>
> [[alternative HTML version deleted]]

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
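If the aim is to recover the original letters rather than ASCII stand-ins, one option is to map each literal escape to the real character. This is a rough sketch under the assumption that only a few known escapes occur; `restore_map` and `restore_chars()` are made-up names for illustration.

```r
# Assumed mapping from literal escape (backslash + octal digits) to the
# character it stands for, written with \u escapes so the source stays ASCII.
restore_map <- c(
  "\\345" = "\u00e5",  # a with ring above
  "\\344" = "\u00e4",  # a with diaeresis
  "\\366" = "\u00f6"   # o with diaeresis
)

# Replace each literal escape with its real character.
restore_chars <- function(x, map) {
  for (code in names(map)) {
    x <- gsub(code, map[[code]], x, fixed = TRUE)
  }
  x
}

restored <- restore_chars("H\\345rkan", restore_map)
```

From there, `iconv(restored, "UTF-8", "ASCII//TRANSLIT")` is one way to transliterate to ASCII, though what it returns depends on the platform's iconv implementation.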