Kjetil Brinchmann Halvorsen
2004-Nov-06 15:48 UTC
[Rd] foreign(read.spss) in rw2000 and re2001beta
I encountered something strange with read.spss (package foreign, version 0.7 with R2.0.0 and version 0.8 with R2.0.1 beta, Windows XP).

I made a test file test.sav with SPSS version 11.5.1 containing only one numeric variable, with a value label for one value not occurring in the file. According to ?read.spss this should result in a factor, but it results in all NA. Using the argument use.value.labels=FALSE, everything is read as expected.

test <- read.spss("test.sav", to.=TRUE)
test
> only NA's

Kjetil

--
Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
   -- Mahdi Elmandjra
On Sat, 6 Nov 2004, Kjetil Brinchmann Halvorsen wrote:

> I encountered something strange with read.spss (package foreign, version
> 0.7 with R2.0.0 and version 0.8 with R2.0.1 beta, windows XP)

Please clarify: the same thing in both versions (read.spss has not changed between them)?

> I made a test file test.sav with SPSS version 11.5.1
> containing only one numeric variable, with a value label
> for one value not occurring in the file. According to ?read.spss
> this should result in a factor, but it results in all NA. Using the
> argument use.value.labels=FALSE, everything is read as expected.

Can you make that file available please?

> test <- read.spss("test.sav", to.=TRUE)
> test
> only NA's
>
> Kjetil

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Kjetil Brinchmann Halvorsen <kjetil@acelerate.com> writes:

> I made a test file test.sav with SPSS version 11.5.1
> containing only one numeric variable, with a value label
> for one value not occurring in the file. According to ?read.spss
> this should result in a factor, but it results in all NA.

Er, what do you mean "all NA"? If the only factor level corresponds to a value that isn't present, wouldn't you expect to get a factor with one level and all values missing? What does str() say about the resulting object?

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907
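The behaviour described in the message above can be reproduced without SPSS. This is only a sketch in base R: the data values and the label "Refused" are made up, since the original test.sav is not available, and the call to factor() stands in for whatever read.spss does internally.

```r
# Simulated stand-in for the single numeric variable in test.sav.
# These values are invented for illustration.
x <- c(1, 2, 3, 4)

# Hypothetical value label: code 9 carries a label, but 9 never occurs in x.
labelled_codes <- c(Refused = 9)

# A factor whose only level is the labelled code: none of the observed
# values matches that level, so every element becomes NA.
f <- factor(x, levels = labelled_codes, labels = names(labelled_codes))

str(f)                 # compact summary of the resulting object
print(levels(f))       # the single level, "Refused"
print(all(is.na(f)))   # TRUE: a one-level factor with all values missing
```

So "all NA" and "a factor with one level" are not contradictory: the object is still a factor, it simply has no non-missing values.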
On Sat, 6 Nov 2004, Peter Dalgaard wrote:

> Kjetil Brinchmann Halvorsen <kjetil@acelerate.com> writes:
>
>> I made a test file test.sav with SPSS version 11.5.1
>> containing only one numeric variable, with a value label
>> for one value not occurring in the file. According to ?read.spss
>> this should result in a factor, but it results in all NA.
>

It should result in a factor all of whose values are missing. And it does.

I have modified read.spss (but not committed the changes yet) so that it does not create a factor when there are missing levels.

The problem is that SPSS uses value labels for two different things: for factors and for labelling a subset of values (e.g. different types of missing). It is hard for R to guess which the user intends.

You can always set use.value.labels=FALSE. You still get the value labels read in and then you can decide what to do with them.

	-thomas
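One way to "decide what to do with them" after reading with use.value.labels=FALSE might look like the sketch below. It is not the modified read.spss code, which is not shown in this thread. The data are simulated, the attribute name "value.labels" is an assumption about where the labels end up, and treating partially labelled codes as missing is just one possible interpretation.

```r
# Simulated numeric variable as it might look with use.value.labels = FALSE.
# Both the values and the attribute name "value.labels" are assumptions.
x <- c(1, 2, 3, 9, 2)
attr(x, "value.labels") <- c(Refused = 9)

value_labels <- attr(x, "value.labels")
values <- as.vector(x)   # plain numeric vector, attributes dropped

# Does every observed (non-missing) value have a label?
fully_labelled <- all(values[!is.na(values)] %in% value_labels)

if (fully_labelled) {
  # Labels cover all values: the variable looks like a genuine factor.
  result <- factor(values, levels = value_labels,
                   labels = names(value_labels))
} else {
  # Labels cover only a subset: keep the variable numeric and, as one
  # possible choice, treat the labelled codes as missing.
  result <- values
  result[result %in% value_labels] <- NA
}

print(result)
```

Here only the code 9 is labelled, so the variable stays numeric and the 9 is recoded to NA; a variable in which every value carried a label would become a factor instead.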