Hi all. I have an SPSS file that I'm loading into R with the Hmisc spss.get function. The trouble is that the SPSS file uses the Windows-1252 character set (which I think is the default for SPSS on Windows) instead of plain-ol' Latin-1, and since spss.get doesn't allow me to pass the "reencode" option to read.spss, any characters in Windows-1252 that are not a part of Latin-1 (such as curly quotes, en-dashes, and a handful of others) come into R looking like this: "Don\x92t know". Now if I read that SPSS file in with read.spss and include "reencode='Windows-1252'", those characters convert to UTF-8 just fine, yielding "Don’t know". But then, of course, I don't get the niceties of spss.get, such as the "labels" attributes on the columns.

So my question is, how can I either pass the "reencode='Windows-1252'" option through to read.spss, or how can I make spss.get default to reencoding from Windows-1252 instead of Latin-1?

Thanks,
Dan
Hello,

If all you need is the column labels, read.spss does return them as an attribute of the list/data.frame. You can write an extractor function:

    variable.labels <- function(x) attr(x, "variable.labels")

and then

    dfr <- read.spss(file.choose(), reencode = 'Windows-1252', to.data.frame = TRUE)
    variable.labels(dfr)

Hope this helps,

Rui Barradas

On 08-09-2012 04:17, Dan Delaney wrote:
> Hi all. I have an SPSS file that I'm loading into R with the Hmisc spss.get function. [...]
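If per-column labels are wanted (closer to what spss.get provides), the extractor idea above could be taken one step further. The following is only a sketch: attach_labels is a hypothetical helper, not part of foreign or Hmisc, and it assumes a plain "label" attribute on each column is enough. Hmisc may additionally set a class on labelled columns, which this does not reproduce. A toy data frame stands in for the result of read.spss so the example runs without an SPSS file.

```r
# Hypothetical helper (not from foreign or Hmisc): copy the data-frame-level
# "variable.labels" attribute onto each matching column as a "label" attribute.
attach_labels <- function(dfr) {
  labs <- attr(dfr, "variable.labels")
  for (nm in intersect(names(labs), names(dfr))) {
    attr(dfr[[nm]], "label") <- labs[[nm]]
  }
  dfr
}

# Toy stand-in for read.spss(..., reencode = "Windows-1252", to.data.frame = TRUE)
dfr <- data.frame(q1 = c(1, 2), q2 = c("a", "b"))
attr(dfr, "variable.labels") <- c(q1 = "Age in years", q2 = "Don\u2019t know")

dfr <- attach_labels(dfr)
attr(dfr$q1, "label")
attr(dfr$q2, "label")
```

Columns without an entry in "variable.labels" are left untouched.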
On Sep 8, 2012, at 05:17, Dan Delaney wrote:

> [...]
> So my question is, how can I either pass the "reencode='Windows-1252'" option through to read.spss, or how can I make spss.get default to reencoding from Windows-1252 instead of Latin-1?

Would it work to do the conversion afterwards?

> iconv("\x92", from="CP1252")
[1] "’"

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
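Converting afterwards, as suggested above, might be applied to a whole data frame roughly as follows. This is a sketch under assumptions: reencode_cp1252 is a hypothetical helper, and it assumes the data were read without any reencoding, so the strings still hold raw Windows-1252 bytes. Strings that are already UTF-8 would be mangled by a second conversion. It touches character columns, factor levels, and any per-column "label" attribute.

```r
# Hypothetical helper: convert raw CP1252 bytes to UTF-8 in character columns,
# factor levels, and per-column "label" attributes of a data frame.
reencode_cp1252 <- function(dfr) {
  fix <- function(s) iconv(s, from = "CP1252", to = "UTF-8")
  for (nm in names(dfr)) {
    col <- dfr[[nm]]
    if (is.character(col)) {
      col[] <- fix(col)                 # col[] keeps the column's attributes
    } else if (is.factor(col)) {
      levels(col) <- fix(levels(col))   # value labels usually arrive as levels
    }
    lab <- attr(col, "label")
    if (is.character(lab)) attr(col, "label") <- fix(lab)
    dfr[[nm]] <- col
  }
  dfr
}

# Toy stand-in for a data frame returned by spss.get without reencoding
dfr <- data.frame(answer = "Don\x92t know", n = 3, stringsAsFactors = FALSE)
out <- reencode_cp1252(dfr)
out$answer
```

Numeric columns pass through unchanged, so the helper can be run over the full result of spss.get.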
If you have control of the SPSS file and you turn Unicode mode on in SPSS, the .sav file will be encoded in Unicode (UTF-8). With any .sav file from SPSS V15 or later, the encoding is written into the file header. If it isn't Unicode, it is determined by the SPSS locale setting.