Dirk Enzmann
2005-Sep-20 21:44 UTC
[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?
Trying to select a subset of cases (rows of data) I encountered several
problems:
Firstly, because I did not read the help for read.spss() thoroughly
enough, I treated the data it returned as a data frame. For example,
dr2000 <- read.spss('myfile.sav')
d <- subset(dr2000,RBINZ99 > 0)
and thus received an error message (Object "RBINZ99" not found), because
dr2000 is not a data.frame but a list (shown by class(dr2000)).
d <- subset(dr2000,dr2000$RBINZ99 > 0)
didn't help either, because now d is empty (dim = NULL).
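A minimal sketch of why both calls fail, using a small made-up list in place of the read.spss() result (the object `toy` and the component `OTHER` are hypothetical; only RBINZ99 comes from the post). For a plain list, subset() falls through to subset.default(), which evaluates the condition in the calling environment rather than inside the list, and then uses the logical vector to index the list's components, not the cases:

```r
# Toy stand-in for the list returned by read.spss(); the values are made up.
toy <- list(RBINZ99 = c(0, 2, 5, 0, 1),
            OTHER   = c("a", "b", "c", "d", "e"))

# 1. The bare name is looked up outside the list, so this call fails
#    (assuming no object named RBINZ99 exists in the workspace).
res1 <- tryCatch(subset(toy, RBINZ99 > 0),
                 error = function(e) conditionMessage(e))

# 2. With toy$RBINZ99 the condition can be evaluated, but the logical
#    vector then picks list components instead of rows.
res2 <- subset(toy, toy$RBINZ99 > 0)
class(res2)   # still a list, not a data frame
dim(res2)     # NULL
```

Either way the result is not a row-wise selection, which matches the behaviour described above.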
Thus, I tried to use the option "to.data.frame=T" of read.spss():
dr2000 <- read.spss('myfile.sav',to.data.frame=T)
However, now R "crashes" ('R for Windows GUI front-end has found an
error and must be closed') (the error message is in German).
Finally, I tried again using read.spss() without the option
'to.data.frame=T' (as before) and tried to convert dr2000 to a data
frame by using
d <- as.data.frame(dr2000)
However, R crashes again (with the same error message).
Of course, I could use SPSS first and save only the cases with RBINZ99 >
0, but this is not always possible (all users of the data must have SPSS
available and we have to use different selection criteria). Is there
another possibility to solve the problem by using R? I want to select
certain rows (cases) based on the values of one "variable" of dr2000,
but keep all columns (variables) - although dr2000 is not a data frame?
And: R should not crash but rather give a warning.
------------------------
R version 2.1.1 Patched (2005-07-15)
Package Foreign Version 0.8-10
Operating system: Windows XP Professional (5.1 (Build 2600))
CPU: Pentium Model 2 Stepping 9
RAM: 512 MB
*************************************************
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Edmund-Siemers-Allee 1
D-20146 Hamburg
Germany
phone: +49-040-42838.7498 (office)
+49-040-42838.4591 (Billon)
fax: +49-040-42838.2344
email: dirk.enzmann at jura.uni-hamburg.de
www:
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
Dirk Enzmann
2005-Sep-21 11:18 UTC
[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?
The selection problem can be solved by
dr2000=read.spss('myfile')
d=lapply(dr2000,subset,dr2000$RBINZ99 > 0)
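The same idea can be wrapped in a small helper. This is only a sketch under the assumption that every component of the list is a vector or factor of equal length; `select_rows` and the toy data are hypothetical names, not part of the foreign package:

```r
# Hypothetical helper: keep the cases where 'keep' is TRUE in every
# component of a list of equal-length vectors.
select_rows <- function(lst, keep) {
  idx <- which(keep)               # which() drops NA, as subset() does
  lapply(lst, function(x) x[idx])  # '[' keeps factors as factors
}

# Toy stand-in for the read.spss() result; the values are made up.
toy <- list(RBINZ99 = c(0, 2, NA, 0, 1),
            GROUP   = factor(c("a", "b", "a", "b", "a")))

d <- select_rows(toy, toy$RBINZ99 > 0)
sapply(d, length)   # every component now has the same, shorter length
```

Note that lapply() returns a plain list that keeps the component names, but any other attributes attached to the list itself are not carried over, so they would have to be copied by hand if they are needed.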
However, there is still the problem that R crashes when using
d = as.data.frame(dr2000)
or
dr2000=read.spss('myfile',to.data.frame=T)
Any suggestions why? I checked whether all components of dr2000 are of
the same length and the sort of object of each component. This is not
the problem: Each component has the same length (9232) and there are 66
components of the class 'character', 981 of the class 'factor', and 479
of the class 'numeric'.
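A check of this kind can be sketched as follows, again on a made-up list rather than the real file (the object `toy` and its components are hypothetical):

```r
# Toy stand-in for the read.spss() result; the values are made up.
toy <- list(ID    = c("x1", "x2", "x3"),
            GROUP = factor(c("a", "b", "a")),
            SCORE = c(1.5, 2, 3.25))

# Length of every component; a single distinct value means all agree.
lens <- sapply(toy, length)
table(lens)

# First class of every component, tabulated.
cls <- sapply(toy, function(x) class(x)[1])
table(cls)
```

Applied to dr2000, the first table would show a single length and the second the counts per class.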