Hello,

I am just transitioning from SPSS to R. I used the haven library to import some of my SPSS data files into R.

However, when I run procedures such as frequencies or crosstabs, value labels for categorical variables such as gender (1=male, 2=female) are not shown. The same applies to much of the other output. I am confused.

1. Is there a global setting that I can use to force all categorical variables to display labels?
2. Or, are these labels to be set for each function or package?
3. How can I request the value labels for each function I run?

Thanks in advance for your help.

Best, Yawo
What does your data look like after importing? -- see ?head and ?str to tell us. Show us the code that failed to provide "labels." See the posting guide below for how to post questions that are likely to elicit helpful responses.

I know nothing about the haven package, but see ?factor or go through an R tutorial or two to learn about factors, which may be part of the issue here. R *generally* obtains whatever "label" info it needs from the object being tabled -- see ?tabulate, ?table etc. -- if that's what you're doing.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Feb 7, 2020 at 8:28 AM Yawo Kokuvi <yawo1964 at gmail.com> wrote:
> I am just transitioning from SPSS to R.
> [...]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
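To make the point about factors concrete, here is a minimal base-R sketch. The vector `sex_codes` is made up for illustration and is not taken from the poster's file. It shows that table() prints whatever the tabled object carries: bare codes for a plain numeric vector, level labels for a factor.

```r
# Hypothetical codes standing in for an imported SPSS variable
# (assumed coding: 1 = MALE, 2 = FEMALE)
sex_codes <- c(1, 1, 2, 2, 2)

# A plain numeric vector carries no labels, so table() shows the codes
tab_codes <- table(sex_codes)

# A factor stores the labels as its levels, and table() uses them
sex_factor <- factor(sex_codes, levels = c(1, 2), labels = c("MALE", "FEMALE"))
tab_labels <- table(sex_factor)

print(tab_codes)
print(tab_labels)
```

The same idea should carry over to barplot(table(...)) and similar functions, since they take their category names from the table.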
Thanks for all your assistance.

Attached, please find the Rdata scratch file I have been using.

-----------------------------------------------------
> head(Scratch, n=13)
# A tibble: 13 x 6
      ID marital           sex        race      paeduc    speduc
   <dbl> <dbl+lbl>         <dbl+lbl>  <dbl+lbl> <dbl+lbl> <dbl+lbl>
 1     1 3 [DIVORCED]      1 [MALE]   1 [WHITE] NA        NA
 2     2 1 [MARRIED]       1 [MALE]   1 [WHITE] NA        NA
 3     3 3 [DIVORCED]      1 [MALE]   1 [WHITE] 4         NA
 4     4 4 [SEPARATED]     1 [MALE]   1 [WHITE] 16        NA
 5     5 3 [DIVORCED]      1 [MALE]   1 [WHITE] 18        NA
 6     6 1 [MARRIED]       2 [FEMALE] 1 [WHITE] 14        20
 7     7 1 [MARRIED]       2 [FEMALE] 2 [BLACK] NA        12
 8     8 1 [MARRIED]       2 [FEMALE] 1 [WHITE] NA        12
 9     9 3 [DIVORCED]      2 [FEMALE] 1 [WHITE] 11        NA
10    10 1 [MARRIED]       2 [FEMALE] 1 [WHITE] 16        12
11    11 5 [NEVER MARRIED] 2 [FEMALE] 2 [BLACK] NA        NA
12    12 3 [DIVORCED]      2 [FEMALE] 2 [BLACK] NA        NA
13    13 3 [DIVORCED]      2 [FEMALE] 2 [BLACK] 16        NA
-----------------------------------------------------

And below is my script/command file.

# 1: Load library and import SPSS dataset
library(haven)
Scratch <- read_sav("~/Desktop/Scratch.sav")

# 2: Save the dataset with a name
save(Scratch, file = "Scratch.Rdata")

# 3: Install & load necessary packages for descriptive statistics
install.packages("freqdist")
library(freqdist)

install.packages("sjlabelled")
library(sjlabelled)

install.packages("labelled")
library(labelled)

install.packages("surveytoolbox")
library(surveytoolbox)

# 4: Check the value labels of gender and marital status
attr(Scratch$sex, "labels")
attr(Scratch$marital, "labels")

# 5: Frequency distribution and bar chart for categorical/ordinal level
#    variables such as gender (sex)
freqdist(Scratch$sex)
barplot(table(Scratch$marital))

-----------------------------------------------------

As you can see from the above, I use the haven package to import the data from SPSS. Apparently, the haven function keeps the value labels, as the attribute calls in section #4 of my script show.
The problem is that when I run a frequency distribution for any of the categorical variables like sex or marital status, only the numbers (1, 2) are displayed in the output. The labels (male, female), for example, are not.

Is there any way to force these to be shown in the output? Is there a global property that I have to set so that these value labels are reliably displayed with every output? I read that I can declare them as factors using as_factor(), but once I do so, how do I invoke them in my commands so that the value labels show?

Sorry about all the noob questions, but hopefully I will be able to get this working.

Thanks in advance.

Thanks - cY

On Fri, Feb 7, 2020 at 1:14 PM <cpolwart at chemo.org.uk> wrote:
> I've never used it, but there is a labels function in haven...
Hi Yawo,

From your recent post, you say you have coerced the variables to factors. If so, perhaps:

as.character(x)

is what you want. If not, creating a new variable like this:

Scratch$new_race <- factor(Scratch$race, levels = c(1, 2), labels = c("WHITE", "BLACK"))

may do it. Note the "levels" argument to get the numeric values in the same order as the original.

Jim

On Sat, Feb 8, 2020 at 7:32 AM Yawo Kokuvi <yawo1964 at gmail.com> wrote:
> Thanks for all your assistance
> [...]
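For what it is worth, here is a minimal sketch of the labelled-to-factor conversion discussed above, written in base R so it runs without the SPSS file. The vector `race` and its "labels" attribute are hypothetical stand-ins that mimic what attr(Scratch$race, "labels") is reported to return. The assumption is that the attribute is a named vector mapping label to code.

```r
# Hypothetical stand-in for a column read by haven::read_sav():
# numeric codes plus a named "labels" attribute (label -> code)
race <- c(1, 1, 2, 1, 2)
attr(race, "labels") <- c(WHITE = 1, BLACK = 2, OTHER = 3)

# Build a factor whose levels are the codes and whose labels are the names
lab <- attr(race, "labels")
race_f <- factor(race, levels = unname(lab), labels = names(lab))

# The table now reports the value labels instead of the codes
print(table(race_f))
```

With haven loaded, as_factor() applied to a single column (or to the whole data frame) is meant to do this conversion for you, for example Scratch$sex <- as_factor(Scratch$sex). I am not aware of a global option that makes base functions such as table() pick up the labels on their own, so converting once after import is probably the simplest route.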
Hi,

Could you upload some sample data in dput form? Something like

dput(head(Scratch, n=13))

will give us some real data to examine. Just copy and paste the output of dput(head(Scratch, n=13)) into the email. This is the best way to ensure that R-help denizens are getting the data in the exact format that you have.

On Fri, 7 Feb 2020 at 15:32, Yawo Kokuvi <yawo1964 at gmail.com> wrote:
> Thanks for all your assistance
> [...]

--
John Kane
Kingston ON Canada
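As a small illustration of why dput output is useful, the sketch below uses a made-up data frame `d` (hypothetical, not the poster's data) and shows that the text dput() produces can be pasted back into R to rebuild the object, types and factor levels included.

```r
# Hypothetical small data frame standing in for head(Scratch, n = 13)
d <- data.frame(ID = 1:3, sex = factor(c("MALE", "MALE", "FEMALE")))

# dput() writes R code that recreates the object; capture it as text
txt <- paste(capture.output(dput(d)), collapse = "\n")
cat(txt, "\n")

# A reader can evaluate that text to rebuild the object on their side
d2 <- eval(parse(text = txt))
print(all.equal(d, d2))
```

Printed output such as head() loses this information (for instance, whether a column is a factor, a character vector, or a labelled numeric), which is why list members tend to ask for dput instead.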