Many thanks, Prof. Ripley and Duncan!
iconv() worked like a charm with CP1252 as the Windows encoding,
and now all the text shows up correctly.
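For the archives, here is a minimal sketch of the kind of conversion that worked, together with the charToRaw() check Duncan suggested. The string is built by hand from CP1252 bytes purely for illustration; it is not the actual data from my workspace, and the variable names x and y are made up:

```r
# Hypothetical example: "åäö" as stored in CP1252 (one byte per character)
x <- rawToChar(as.raw(c(0xe5, 0xe4, 0xf6)))
charToRaw(x)   # the raw CP1252 bytes: e5 e4 f6

# Re-encode from the Windows code page to UTF-8
y <- iconv(x, from = "CP1252", to = "UTF-8")
charToRaw(y)   # two bytes per character in UTF-8: c3 a5 c3 a4 c3 b6
```

Comparing the raw bytes before and after is a quick way to confirm that the guessed source encoding is the right one.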
Because the data frame also contained factors with levels that had
Swedish characters, I ended up writing a small function that converts
the encoding of everything inside a data frame in one go. It is a bit
slow, but hopefully someone else will find it useful in the future:
iconv.data.frame <- function(df, ...) {
    # Convert column names and row names; ... is passed on to iconv()
    names(df) <- iconv(names(df), ...)
    rownames(df) <- iconv(rownames(df), ...)
    # Convert factor levels and character columns; leave other columns as-is
    df[] <- lapply(df, function(x) {
        if (is.factor(x)) {
            levels(x) <- iconv(levels(x), ...)
        } else if (is.character(x)) {
            x <- iconv(x, ...)
        }
        x
    })
    df
}
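A quick sanity check, sketched on a made-up one-column data frame (not my real data), is to confirm that names and text columns are valid UTF-8 after conversion. This assumes a version of R that provides validUTF8(); the conversion is done with plain iconv() calls here so that the snippet runs on its own:

```r
# Made-up data frame whose column name and value are CP1252 bytes
# (0xe5 = å, 0xf6 = ö); hypothetical data for illustration only
df <- data.frame(x = rawToChar(as.raw(c(0x62, 0x6a, 0xf6, 0x72, 0x6e))),
                 stringsAsFactors = FALSE)
names(df) <- rawToChar(as.raw(c(0xe5, 0x72)))

# Before conversion the bytes are not valid UTF-8
before <- c(validUTF8(names(df)), validUTF8(df[[1]]))

# Convert the name and the character column, then check again
names(df) <- iconv(names(df), from = "CP1252", to = "UTF-8")
df[[1]] <- iconv(df[[1]], from = "CP1252", to = "UTF-8")
after <- c(validUTF8(names(df)), validUTF8(df[[1]]))
```

Here `before` should be all FALSE and `after` all TRUE if CP1252 was the right guess for the source encoding.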
Best regards,
Gustaf
On Sun, May 2, 2010 at 8:36 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Sun, 2 May 2010, Duncan Murdoch wrote:
>
>> Gustaf Rydevik wrote:
>>>
>>> Hi all,
>>>
>>> I hope that there is someone that can help me out here.
>>> I am trying to load() a workspace on os x (R 2.11.0) that was saved in
>>> windows XP (R 2.9). In that workspace, there's a data.frame with names
>>> that contain swedish characters. These characters become garbled,
>>> which is a major problem.
>>> From the R windows FAQ, I read:
>>>
>>> "Note though that character data in a workspace will be in a
>>> particular encoding that is not recorded in the workspace, so
>>> workspaces containing non-ASCII character data may not be
>>> interchangeable even on the same OS. Since R marks character data when
>>> it knows it to be in UTF-8 or Latin-1 (including its Windows superset,
>>> CP1252), strings in those encodings are likely to be transferred
>>> correctly: fortunately this covers most of the common cases (Mac OS X
>>> normally uses UTF-8, and Linux users are likely to use UTF-8 or
>>> perhaps Latin-1 (which used to be used for English))."
>>>
>>> Apparently, my case is not the most common one, and I don't know why.
>>> I've been trying to dig into the load() function, but since it uses a
>>> lot of .Internal functions, I get stuck there.
>>> I've also tried doing options(encoding="latin1"), which doesn't seem
>>> to change anything.
>>>
>>
>> You can't change the encoding when you load, but you can convert the
>> encoding later (using iconv()) if you know what encoding it is. A good
>> guess for a file created on Windows in my locale is "latin1", but it's not
>> certain, and I don't know what is commonly used on Windows in a Swedish
>> locale.
>
> CP1252 (which is actually what you will get too).
>
>>
>> If you have an example where you know the correct version of the string
>> and you can show us what you're getting, together with charToRaw() applied
>> to it, someone will probably be able to make a guess at the encoding.
>>
>> Duncan Murdoch
>>
>>
>>> And now I'm stuck. Any suggestions on where to look?
>>> I've run into this issue twice before. The first time I managed to get
>>> it solved, but can't remember how (perhaps a .Rprofile setting
>>> somewhere?).
>>> The second time, I mailed R-Sig-Mac, got some tips that unfortunately
>>> did not lead anywhere, and subsequently gave up. I hope third time's a
>>> charm!
>>>
>>> Many thanks in advance,
>>> Gustaf
>>>
>>>
>>>
>>
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik