Dimitri Liakhovitski
2015-Nov-12 16:56 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
Hello!

I don't have an example file, but I think my question should be clear without it. I have an SPSS file. I read it in using 'haven':

    library(haven)
    spss1 <- read_spss("SPSS_Example.sav")

I created a function that extracts the long labels (in SPSS - "Label"):

    fix_labels <- function(x, TextIfMissing) {
      val <- attr(x, "label")
      if (is.null(val)) TextIfMissing else val
    }
    longlabels <- sapply(spss1, fix_labels, TextIfMissing = "NO LABLE IN SPSS")

This function is supposed to create a vector of long labels, and usually it does, e.g.:

    str(longlabels)
     Named chr [1:64] "Serial number" ...
     - attr(*, "names")= chr [1:64] "Respondent_Serial" "weight" "r7_1" "r7_2" ...

However, I just got an SPSS file with 92 columns and ran exactly the same function on it. Now I am getting not a vector, but a list:

    str(longlabels)
    List of 92
     $ VEHRATED : chr "VEHICLE RATED"
     $ RESPID : chr "RESPONDENT ID"
     $ RESPID8 : chr "8 DIGIT RESPONDENT NUMBER"

An observation about the structure of longlabels here: for those columns that do NOT have a long label in SPSS but DO have Values (value labels), my function grabs their value labels, so that my long label is now recorded as a numeric vector with names, e.g.:

     $ AWARE2 : Named num [1:2] 1 2
      ..- attr(*, "names")= chr [1:2] "VERY/SOMEWHAT FAMILIAR" "NOT AT ALL FAMILIAR"

Question: How could I avoid the extraction of the Value Labels for the columns that have no long labels?

Thank you very much!

--
Dimitri Liakhovitski
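Editorial sketch, not part of the original post: sapply() returns a list as soon as one element is not a length-one value of the same type, which matches the "List of 92" result above. One way to guarantee a character vector is to use vapply() and treat anything that is not a single character string as a missing label. The data frame df and the name fix_labels_strict are hypothetical stand-ins; no real SPSS file or haven call is involved.

```r
# Hypothetical stand-in for a data frame returned by read_spss():
# RESPID has a variable label, AWARE2 has only value labels.
df <- data.frame(RESPID = 1:2, AWARE2 = c(1, 2))
attr(df$RESPID, "label") <- "RESPONDENT ID"
attr(df$AWARE2, "labels") <- c("VERY/SOMEWHAT FAMILIAR" = 1,
                               "NOT AT ALL FAMILIAR" = 2)

fix_labels_strict <- function(x, TextIfMissing) {
  val <- attr(x, "label")
  # Accept only a single character string; anything else counts as missing.
  if (is.character(val) && length(val) == 1L) val else TextIfMissing
}

# vapply() fails loudly if any result is not a character(1),
# so the output is always a named character vector, never a list.
longlabels <- vapply(df, fix_labels_strict, character(1),
                     TextIfMissing = "NO LABEL IN SPSS")
str(longlabels)
```

The type check sidesteps the symptom whatever its cause; it does not explain why attr(x, "label") returned value labels in the first place.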
Dimitri Liakhovitski
2015-Nov-12 22:55 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
Looks like a little bug in 'haven':

When I actually look at the attributes of one variable that has no long label in SPSS but has Value Labels, I am getting:

    attr(spss1$WAVE, "label")
    NULL

But when I sapply my function (fix_labels) to my data frame and ask it to print the long labels for each column, for the same column "WAVE" I am getting, instead of NULL:

    NULL
    VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                         1                      2

This is, of course, incorrect: it seems to grab some other attribute (which one?) and replace NULL with it.

Any suggestions?
Thanks!

--
Dimitri Liakhovitski
Dimitri Liakhovitski
2015-Nov-13 01:37 UTC
[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?
I have to rephrase my question again - it's clearly a small bug in haven. Here is what it is about:

If I have a column in SPSS that has BOTH a long label and value labels, then everything works fine. I access one with 'label' and the other with 'labels':

    attr(spss1$MYVAR, "label")
    [1] "LONG LABEL"

    attr(spss1$MYVAR, "labels")
    DEFINITELY CONSIDER   PROBABLY CONSIDER   PROBABLY NOT CONSIDER   DEFINITELY NOT CONSIDER
                      1                   2                       3                         4

However, if I have a column that has no long label and ONLY value labels, then it's not working properly: both calls return the value labels.

    > attr(spss1$MYVAR, "label")
    VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                         1                      2
    > attr(spss1$MYVAR, "labels")
    VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                         1                      2

And I actually need to be able to identify whether the label is empty.

Thank you for looking into it!
Dimitri

--
Dimitri Liakhovitski
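Editorial sketch, not part of the original thread: the output above is consistent with partial matching in base R's attr(), whose exact argument defaults to FALSE. When no attribute is named exactly "label", the name can then match "labels". Whether this is what happens in the poster's file is an assumption; the toy vector x below is hypothetical and does not use haven.

```r
# Hypothetical column with value labels but no variable label.
x <- structure(c(1, 2, 1),
               labels = c("VERY/SOMEWHAT FAMILIAR" = 1,
                          "NOT AT ALL FAMILIAR" = 2))

# Default lookup: "label" partially matches the only candidate, "labels".
partial <- attr(x, "label")

# Exact lookup: there is no attribute called "label", so this is NULL.
strict <- attr(x, "label", exact = TRUE)

print(partial)
print(is.null(strict))
```

If that is the cause, using attr(x, "label", exact = TRUE) inside fix_labels would make the is.null() test behave as intended for columns that carry only value labels.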