Dimitri Liakhovitski
2015-Nov-12 16:56 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
Hello!
I don't have an example file, but I think my question should be clear
without it.
I have an SPSS file. I read it in using 'haven':
library(haven)
spss1 <- read_spss("SPSS_Example.sav")
I created a function that extracts the long labels (in SPSS -
"Label"):
# Return the variable label of a column, or a fallback text if there is none
fix_labels <- function(x, TextIfMissing) {
  val <- attr(x, "label")
  if (is.null(val)) TextIfMissing else val
}
longlabels <- sapply(spss1, fix_labels, TextIfMissing = "NO LABEL IN SPSS")
This function is supposed to create a vector of long labels and
usually it does, e.g.:
str(longlabels)
Named chr [1:64] "Serial number" ...
 - attr(*, "names")= chr [1:64] "Respondent_Serial" "weight" "r7_1" "r7_2" ...
However, I just got an SPSS file with 92 columns and ran exactly the
same function on it. Now I am getting a list instead of a vector:
str(longlabels)
List of 92
$ VEHRATED : chr "VEHICLE RATED"
$ RESPID : chr "RESPONDENT ID"
$ RESPID8 : chr "8 DIGIT RESPONDENT NUMBER"
An observation about the structure of longlabels here: for those columns
that do NOT have a long label in SPSS but DO have Values (value
labels), my function grabs their value labels, so that my long label
is now recorded as a named numeric vector, e.g.:
 $ AWARE2 : Named num [1:2] 1 2
  ..- attr(*, "names")= chr [1:2] "VERY/SOMEWHAT FAMILIAR" "NOT AT ALL FAMILIAR"
Question: How could I avoid the extraction of the Value Labels for the
columns that have no long labels?
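A minimal sketch of how the offending columns might be located. It does not use haven at all: `cols` is a hypothetical hand-built list standing in for the imported data frame, on the assumption that the imported columns carry their metadata as plain "label" and "labels" attributes. Because a data frame is a list of columns, sapply() treats both the same way.

```r
# Hypothetical stand-in for two imported columns (not real SPSS data).
cols <- list(
  VEHRATED = structure(c(1, 2, 1), label = "VEHICLE RATED"),
  AWARE2   = structure(c(1, 2, 1),
                       labels = c("VERY/SOMEWHAT FAMILIAR" = 1,
                                  "NOT AT ALL FAMILIAR" = 2))
)

# Same function as in the post above.
fix_labels <- function(x, TextIfMissing) {
  val <- attr(x, "label")
  if (is.null(val)) TextIfMissing else val
}

longlabels <- sapply(cols, fix_labels, TextIfMissing = "NO LABEL IN SPSS")

# sapply() simplifies to a vector only when every result has length 1;
# otherwise it silently falls back to a list.
is.list(longlabels)

# Names of the columns whose result is not a single character string.
ok  <- lengths(longlabels) == 1 & sapply(longlabels, is.character)
bad <- names(longlabels)[!ok]
bad
```

Running the same check on the real data should list exactly the columns that have value labels but no long label.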
Thank you very much!
--
Dimitri Liakhovitski
Dimitri Liakhovitski
2015-Nov-12 22:55 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
Looks like a little bug in 'haven':
When I actually look at the attributes of one variable that has no
long label in SPSS but has Value Labels, I am getting:
attr(spss1$WAVE, "label")
NULL
But when I sapply my function fix_labels over my data frame and ask it
to print the long label for each column, for the same column "WAVE" I
am getting, instead of NULL:
NULL
VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                     1                      2
This is, of course, incorrect: it seems to grab some other attribute
(which one?) and replace NULL with it.
Any suggestions?
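One possible explanation, not confirmed anywhere in this thread: base R's attr() falls back to partial matching when no attribute has exactly the requested name, so a request for "label" can be answered by "labels". The sketch below shows that behaviour on a hand-built vector; `x` and its values are hypothetical, and haven is not involved.

```r
# A vector that has value labels ("labels") but no variable label ("label").
x <- structure(c(1, 2, 1),
               labels = c("VERY/SOMEWHAT FAMILIAR" = 1,
                          "NOT AT ALL FAMILIAR" = 2))

# The only attribute present is "labels".
names(attributes(x))

# Default lookup: no exact match for "label", so attr() partially
# matches it to "labels" and returns the value labels.
partial <- attr(x, "label")

# Exact lookup: partial matching is switched off, so the result is NULL.
exact <- attr(x, "label", exact = TRUE)

identical(partial, attr(x, "labels"))
is.null(exact)
```

When a column has both attributes, the exact match on "label" wins, which would be consistent with columns that have a long label behaving as expected.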
Thanks!
--
Dimitri Liakhovitski
Dimitri Liakhovitski
2015-Nov-13 01:37 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
I have to rephrase my question again - it's clearly a small bug in
haven. Here is what it is about:
If I have a column in SPSS that has BOTH a long label and value
labels, then everything works fine - I access one with 'label' and
another with 'labels':
attr(spss1$MYVAR, "label")
[1] "LONG LABEL"
attr(spss1$MYVAR, "labels")
    DEFINITELY CONSIDER       PROBABLY CONSIDER   PROBABLY NOT CONSIDER DEFINITELY NOT CONSIDER
                      1                       2                       3                       4
However, if I have a column that has no long label and ONLY value
labels, then it's not working properly:
attr(spss1$MYVAR, "label")
VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                     1                      2
attr(spss1$MYVAR, "labels")
VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                     1                      2
And I actually need to be able to identify whether the label is empty.
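A hedged sketch of one way to detect an empty label, assuming the behaviour described above comes from attr()'s partial matching rather than from the file import. `get_label` and `cols` are hypothetical names introduced for illustration, and the list is hand-built rather than read with haven.

```r
# Return the variable label, or NA when the column has none.
# exact = TRUE prevents "label" from being matched to "labels".
get_label <- function(x, text_if_missing = NA_character_) {
  val <- attr(x, "label", exact = TRUE)
  if (is.null(val)) text_if_missing else as.character(val)
}

# Hypothetical stand-in: one column with both attributes, one with only
# value labels.
value_labels <- c("VERY/SOMEWHAT FAMILIAR" = 1, "NOT AT ALL FAMILIAR" = 2)
cols <- list(
  MYVAR  = structure(c(1, 2, 1), label = "LONG LABEL", labels = value_labels),
  AWARE2 = structure(c(1, 2, 1), labels = value_labels)
)

# vapply() insists on one string per column, so the result is always a
# character vector and an unexpected shape raises an error.
longlabels <- vapply(cols, get_label, character(1))

# Columns without a variable label.
unlabelled <- names(longlabels)[is.na(longlabels)]
unlabelled
```

Using NA as the fallback keeps "no label" distinguishable from any real label text; a fixed string can still be passed through `text_if_missing` if that is preferred.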
Thank you for looking into it!
Dimitri
--
Dimitri Liakhovitski