Dear Sir/Madam,

Hope everyone is safe and sound. I appreciate your help a lot.

I am evaluating two Arabic subtitles of a humorous English scene and asked 263 participants (part) to evaluate the two subtitles (named Standard Arabic, SA, and Egyptian Arabic, EA) via a questionnaire that asked them to rank the two subtitles in terms of how much each subtitle is

2) more humorous (hum),

5) closer to Egyptian culture (cul)

The questionnaire contained two 1-10 linear scale questions on these two points, with 1 meaning the most humorous and closest to Egyptian culture, and 10 meaning the least humorous and furthest from Egyptian culture. The questionnaire also had a general multiple-choice question about which subtitle is better overall (better). General information about the participants was also collected: gender (categorical factor), age (numeric factor) and education (categorical factor).

Two versions of the questionnaire were used: one showing the SA subtitle first and another showing the EA subtitle first. Nearly half the participants answered the former and nearly half the latter.

I am focusing on which social factor(s) lead(s) the participants to evaluate one of the two subtitles as generally better, and which subtitle is more humorous and closer to Egyptian culture. Each of these points alone can be the dependent variable, but the results can also be linked together.

I thought that mixed-effects analyses would clarify the picture and answer the research question (which factor(s) lead(s) participants to favour one subtitle over the other?), so I tried the lme4 package in R and ran many models, but none of the code I have used works. I ran the following, which yielded error messages like:

model1 <- lmer(better ~ gender + age + education + WF + (1 | part), data = sub_data)

Error: number of levels of each grouping factor must be < number of observations (problems: part)

Model2 <- glmer(better ~ gender + age + education + WF + (1 | part), data = sub_data, family = 'binomial')

Error in mkRespMod(fr, family = family) :
  response must be numeric or factor

Model3 <- glmer(better ~ age + gender + education + WF + (1 | part), data = sub_data, family = 'binomial', control = glmerControl(optimizer = c("bobyqa")))

Error in mkRespMod(fr, family = family) :
  response must be numeric or factor

Why does the model crash? Does the problem lie in the random factor part (which is a code for participants)? Or is it something related to the mixed-effects analysis?

Best
Saudi Sadiq
Hi Saudi,

I can only make a guess, which is that a variable with a unique value for each participant has been read in as a factor. I assume that "better" is some combination of "hum" and "cul"; and what exactly is WF?
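A quick way to check that, once the data are loaded (only a sketch -- it assumes your data frame is called sub_data and has the columns used in your models):

# how each column was read in; a response column read in as plain character
# text (neither numeric nor a factor) would trigger glmer()'s
# "response must be numeric or factor" error
str(sub_data)
sapply(sub_data, class)

# if every participant contributes exactly one row, the grouping factor in
# (1 | part) has as many levels as there are observations, which is what
# the first error message is about
length(unique(sub_data$part))
nrow(sub_data)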
Jim

Hi Jim,

So many thanks for your reply. I actually made a mistake in presenting the problem: I should have clarified that the 1-10 linear scale questions went the other way, with 10 meaning the most humorous/closest to Egyptian culture and 1 the least. I should also have attached some examples so the participant issue could be clear. The dataset is attached (if that does not go against the rules of the R-help list).

Actually, I wanted better to be the only dependent variable, and asking participants 'which subtitle is better?' could have been enough, but I wanted detailed information on why a subtitle is better, so I asked specific questions about which subtitle is more humorous and closer to Egyptian culture. Most of the time the hum and cul ratings together agree with better, but sometimes they do not (e.g. the sum for subtitle EA can be bigger than for SA, yet the participant prefers SA in the better column; a quick cross-check of this is sketched below).

WF ("watched first") is the order in which participants watched the two subtitles: some participants watched the SA subtitle first and others watched the EA subtitle first.
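Regarding the point that the ratings usually, but not always, agree with better, a cross-check along these lines would show how often they do (only a sketch -- it assumes the attached file has the rating columns SA_hum, SA_cul, EA_hum and EA_cul, and that better is coded with "SA"/"EA" labels):

# which subtitle gets the higher combined rating per participant
# (ties are counted as SA here, just for illustration)
rating_pref <- with(sub_data, ifelse(EA_hum + EA_cul > SA_hum + SA_cul, "EA", "SA"))
# compare that with the participant's overall choice
table(rating_pref, sub_data$better)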
Does this make sense?

All the best

--
Saudi Sadiq,
Lecturer, Minia University, Egypt
Academia <http://york.academia.edu/SaudiSadiq>, ResearchGate <https://www.researchgate.net/profile/Saudi_Sadiq>, Google Scholar <https://scholar.google.co.uk/citations?user=h0latzcAAAAJ&hl=en>, Publons <https://publons.com/researcher/2950905/saudi-sadiq/>
Certified Translator (Egyta) <https://www.egyta.com/>
Associate Fellow of the Higher Education Academy, UK <https://www.heacademy.ac.uk/>
Hi Saudi,

Apologies for the delay (also returning this to the list).

In your initial code,

model1 <- lmer(better ~ gender + age + education + WF + (1 | part), data = sub_data)

you have age as a fixed effect, and it also has 36 levels. This is probably causing the error you describe above, so I have changed it to a random factor. Your response variable is "better", which has the same levels as WF and is not numeric; this looks like a mistake. Instead, I have written four models with the "hum" and "cul" variables as response variables, which looks more sensible to me. The levels of the "education" variable are not ordered correctly, so the code below reorders them. The code runs okay, but there is a singular fit for EA_cul. The effects seem to be those of education, except in the EA_cul model.

The following may get you started:

sub_data <- read.csv("sub_data.csv", stringsAsFactors = FALSE)
# get the education factor into the correct order
sub_data$education <- factor(sub_data$education,
  levels = c("seconadry or below", "university", "postgrad"))
library(lme4)
modelSA_hum <- lmer(SA_hum ~ gender + education + WF + (1 | age), data = sub_data)
modelSA_cul <- lmer(SA_cul ~ gender + education + WF + (1 | age), data = sub_data)
modelEA_hum <- lmer(EA_hum ~ gender + education + WF + (1 | age), data = sub_data)
modelEA_cul <- lmer(EA_cul ~ gender + education + WF + (1 | age), data = sub_data)
summary(modelSA_hum)
summary(modelSA_cul)
summary(modelEA_hum)
summary(modelEA_cul)
# look at the distribution of responses
table(sub_data$SA_hum)
table(sub_data$SA_cul)
table(sub_data$EA_hum)
table(sub_data$EA_cul)
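If the original question of which subtitle is rated better overall is still of interest, one further possibility is an ordinary logistic regression: each participant contributes only one row for that outcome, so a per-participant random intercept cannot be estimated anyway. This is only a sketch -- it assumes "better" is coded with the same two SA/EA labels as WF:

# make the overall choice a factor so a binomial model will accept it
sub_data$better <- factor(sub_data$better)
# age enters as an ordinary covariate here, since there is no grouping
# structure left to support a random effect
model_better <- glm(better ~ gender + age + education + WF,
                    family = binomial, data = sub_data)
summary(model_better)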
Jim

On Sun, Jun 14, 2020 at 9:42 AM Saudi Sadiq <saudisadiq at gmail.com> wrote:
>
> Hi Jim,
> Hope you are safe and sound.
> So sorry to bother you again. I am still waiting for your reply after I have attached the dataset.
> I know you are very busy, but I will appreciate it a lot if you can guide me in how to make the (g)lmer model work, or guide me to something different.
> All the best