Christine Christmann
2008-Apr-15 09:51 UTC
[R] How can I import user-defined missings from Spss?
Hi,

Importing SPSS datasets works for me, either via library(foreign) with read.spss or via library(Hmisc) with spss.get. But whichever way I import the data, user-defined missings from SPSS are always lost. (It makes no difference whether they are a single value, a range, or any combination of them. They are always ignored.)

Is there any way in R to find out whether a value was a user-defined missing in SPSS or not? Keeping the information as an attribute would suit me fine; keeping the values as a character string like "miss" would be even better. Transforming them into NA, as is done automatically for the sysmis data from SPSS, would be another alternative.

Unfortunately I don't know if any of these options are possible. Could you help me out?

Let me give you an example.

Preconditions: You need to have SPSS on your computer to generate the SPSS data. You need to create the folder C:/tmp to save the SPSS file. As you can see, I work with Windows.

1) Generate the SPSS data:

*/1) Generate the SpssData:
*/data.
DATA LIST LIST /age (f2) sport (f2).
BEGIN DATA
22, 1
40, 2
69, 1
19, 2
-99, 9
END DATA.

*/description.
missing values age (LO thru 0).
missing values sport (9).
var label age "age".
var label sport "Do you like sports".
value label sport
1 "yes"
2 "no"
3 "don't know".

*frequencies in Spss.
freq age sport.

save outfile = "C:\tmp\test.sav".
*-----------------------------------------------------------------------------------------.

2) Import the SPSS data in R. Via Hmisc or foreign, both work fine.

#import Spssdata in R
spssfile <- "C:/tmp/test.sav"

#via Hmisc
library(Hmisc)
Signs <- c("_")
mydata1 <- spss.get(spssfile, lowernames=TRUE, allow=Signs)

#via foreign
library(foreign)
mydata2 <- read.spss(spssfile, use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)

#freq in r (describe() comes from Hmisc)
describe(mydata1)
describe(mydata2)
*-----------------------------------------------------------------------------------------.

Have a look at the two variables age and sport. In SPSS the value -99 in age is a missing, as is the value 9 in sport. As you can see, the information about the missings is lost in R. What can I do?

Many thanks
Christine Christmann
Prof Brian Ripley
2008-Apr-15 10:13 UTC
[R] How can I import user-defined missings from Spss?
You have already had a reply to a version of this (posted from another address) at https://stat.ethz.ch/pipermail/r-help/2008-April/159342.html . 'Kind souls' are likely to get exasperated when their help is unacknowledged.

You need SPSS and Windows to reproduce this, and this is the R forum. To fulfil the footer of the message you need to make available the spss save file.

On Tue, 15 Apr 2008, Christine Christmann wrote:

> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Christine Christmann
2008-Apr-15 20:59 UTC
[R] How can I import user-defined missings from Spss?
Ok, if I have to make the SPSS file available, then I hope an attachment is fine. I would really appreciate it if any 'kind soul' would give me a push in the right direction to solve this problem.

The SPSS file contains only two variables and five cases. For one case, both values are defined as missings. In R all cases are valid. Any information about missings is lost.

What can I do to keep the missing information?

Cheers
Christine

*-----------------------------------------------------------------------------------------.
> > to import the Spss Data in R. Via Hmisc or foreign - both work fine.
> >
> > #import Spssdata in R
> > spssfile <- "PathToTheSavedSpssFile"
> >
> > #via Hmisc
> > library(Hmisc)
> > Signs <- c("_")
> > mydata1 <- spss.get(spssfile,lowernames=TRUE, allow=Signs)
> >
> > #via foreign
> > library(foreign)
> > mydata2 <- read.spss(spssfile,use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> >
> > #freq in r
> > describe(mydata1)
> > describe(mydata2)

> -----Original Message-----
> From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
> Sent: 15.04.08 12:13:45
> To: Christine Christmann <christinechristmann at web.de>
> CC: r-help at r-project.org
> Subject: Re: [R] How can I import user-defined missings from Spss?
>
> You have already had a reply to a version of this (posted from another
> address) at https://stat.ethz.ch/pipermail/r-help/2008-April/159342.html .
> 'Kind souls' are likely to get exasperated when their help is
> unacknowledged.
>
> You need SPSS and Windows to reproduce this, and this is the R forum. To
> fulfil the footer of the message you need to make available the spss save
> file.
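
One possible workaround, sketched here rather than taken from any reply in the thread, is to re-apply the SPSS missing-value definitions by hand after the import. The example below rebuilds the five cases from the question as a plain data frame, so it runs without SPSS or the .sav file. The names user_missing, apply_user_missing and the attribute name "user_missing_codes" are hypothetical and chosen only for illustration; the rules are copied manually from the "missing values" statements in the question (age: LO thru 0; sport: 9).

```r
# Toy data mirroring the SPSS example from the question, i.e. the values as
# they arrive in R when user-defined missings have not been applied.
mydata <- data.frame(
  age   = c(22, 40, 69, 19, -99),
  sport = c(1, 2, 1, 2, 9)
)

# Hypothetical rule table: one predicate per variable, transcribed by hand
# from the SPSS syntax (age: LO thru 0; sport: 9).
user_missing <- list(
  age   = function(x) x <= 0,
  sport = function(x) x == 9
)

# Set user-defined missings to NA and keep the original codes in an
# attribute, so the information is not lost.
apply_user_missing <- function(df, rules) {
  for (v in names(rules)) {
    original <- df[[v]]
    is_miss  <- !is.na(original) & rules[[v]](original)
    df[[v]][is_miss] <- NA
    attr(df[[v]], "user_missing_codes") <- original[is_miss]
  }
  df
}

cleaned <- apply_user_missing(mydata, user_missing)
print(cleaned)
print(attr(cleaned$age, "user_missing_codes"))
```

The same function could be applied to mydata1 or mydata2 from the question, provided the affected columns are still numeric at that point (with use.value.labels=TRUE a labelled variable such as sport may come back as a factor, which this sketch does not handle). Current versions of read.spss in the foreign package also document a use.missings argument; whether it covers a given file is best checked against that help page.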