Dear R-users,

I need some assistance.

I am building interaction terms for categorical variables. I have dgen (2 levels, converted to dummy variables) and dtoe (4 levels, also converted to dummy variables). I have worked with them in two ways.

First, I created a variable X1 = dgen*dtoe, and I get the error:

Error in dgen * dtoe : non-conformable arrays

Then I ran a binomial glm using that interaction, and I get:

logit_x = glm(samp2$STATUS ~ dgen*dtoe, data=samp2, family = binomial("logit"))
> summary(logit_x)

Call:
glm(formula = samp2$STATUS ~ dgen * dtoe, family = binomial("logit"),
    data = samp2)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6594   0.2431   0.2563   0.2563   0.2563

Coefficients: (5 not defined because of singularities)
                              Estimate Std. Error z value Pr(>|z|)
(Intercept)                  1.857e+01  4.612e+03   0.004    0.997
dgendfemale                  1.024e-09  3.766e+03   0.000    1.000
dgendmale                           NA         NA      NA       NA
dtoedpermanent1             -1.517e+01  4.612e+03  -0.003    0.997
dtoedcontract1              -1.511e+01  4.612e+03  -0.003    0.997
dtoedprobation1              2.229e-09  4.982e+03   0.000    1.000
dgendfemale:dtoedpermanent1  1.069e-01  3.766e+03   0.000    1.000
dgendmale:dtoedpermanent1           NA         NA      NA       NA
dgendfemale:dtoedcontract1   1.511e+01  3.962e+03   0.004    0.997
dgendmale:dtoedcontract1            NA         NA      NA       NA
dgendfemale:dtoedprobation1         NA         NA      NA       NA
dgendmale:dtoedprobation1           NA         NA      NA       NA

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 269.48  on 999  degrees of freedom
Residual deviance: 266.56  on 993  degrees of freedom
AIC: 280.56

Number of Fisher Scoring iterations: 17

The thing is, I need the coefficients, the p-values and the t-values of all the variables. In other words, I do not want an output of NAs. How can I achieve this?

Thanks a lot.

Taby
The reason you get NAs is rank deficiency. The output even says that five coefficients are not defined because of singularities. Most likely, certain combinations of categories do not occur in your data.

Note that in the example below y is ALWAYS zero when x is zero. The product x*y is therefore identical to y, which makes the interaction inestimable and leads to a singularity like the ones you are seeing.

x <- rep(c(0, 1), each = 10)
y <- c(rep(0, 10), rep(c(0, 1), each = 5))
data.frame(x, y)            # y is 1 only where x is 1
p <- 1 / (1 + exp(-x - y))  # success probabilities on the logit scale
z <- rbinom(20, 1, p)       # simulated binary response
reg <- summary(glm(z ~ y * x, binomial))
reg                         # the y:x coefficient is reported as NA

The answer to your question, then, is that you cannot get estimates, standard errors, or p-values for these coefficients.

HTH,
Daniel
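One way to check for the missing category combinations mentioned above is to cross-tabulate the two predictors before fitting. The following is only a sketch on simulated data: the names `gender`, `toe`, and `status`, the level labels, and the deliberately emptied cell are all assumptions for illustration, not the original poster's data.

```r
# Simulated stand-in data; all names and levels here are hypothetical.
set.seed(1)
n <- 200
gender <- factor(sample(c("female", "male"), n, replace = TRUE))
toe <- factor(sample(c("contract", "permanent", "probation", "temporary"),
                     n, replace = TRUE))

# Deliberately empty one cell: no males on probation.
toe[gender == "male" & toe == "probation"] <- "contract"
status <- rbinom(n, 1, 0.7)

# A zero in this table marks a combination that never occurs,
# so the matching interaction coefficient cannot be estimated.
cell_counts <- table(gender, toe)
print(cell_counts)

fit <- glm(status ~ gender * toe, family = binomial)

# Coefficients that glm() could not estimate are returned as NA.
na_coefs <- names(coef(fit))[is.na(coef(fit))]
print(na_coefs)

# The fitted rank is smaller than the number of coefficients requested.
print(c(rank = fit$rank, requested = length(coef(fit))))
```

If the table shows empty cells in real data, the usual options are to merge sparse levels or to drop the interaction. No fitting option will recover an estimate for a combination that was never observed.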
On Jun 21, 2011, at 08:39 , taby gathoni wrote:

> I have dgen(2 levels converted to dummy variables) and dtoe(4-levels also converted to dummy variables).
>
> The thing is I need the coefficients, the p-values and t-values of all the variables. In other words i do not want an output of NAs. How can I achieve this?

Something is odd here. What do you mean by "converted to dummy variables"? Normally you'd use factor variables and let the modelling machinery do the rest.

Why do you have two dummies for dgen but only three for the four-level dtoe? Notice that you can't have, e.g., both a "female" and a "male" dummy when there is an intercept in the model (unless you have a third sex in your data), since the two dummies will sum to 1 and so duplicate the intercept column.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
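To illustrate the point about dummies versus factors, here is a small sketch on simulated data. The data frame `dat`, its column names, and the level labels are hypothetical stand-ins, since the structure of the original `samp2` is not shown in the thread.

```r
# Simulated stand-in data; all names and levels here are hypothetical.
set.seed(2)
n <- 300
dat <- data.frame(
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  toe = factor(sample(c("contract", "permanent", "probation", "temporary"),
                      n, replace = TRUE))
)
dat$status <- rbinom(n, 1, 0.6)

# Hand-made dummies for both sexes: dfemale + dmale equals 1 in every row,
# which is exactly the intercept column, so one coefficient is aliased.
dat$dfemale <- as.numeric(dat$gender == "female")
dat$dmale <- as.numeric(dat$gender == "male")
fit_dummies <- glm(status ~ dfemale + dmale, family = binomial, data = dat)
print(coef(fit_dummies))  # dmale is NA

# Factor approach: R drops one reference level per factor automatically,
# so main effects and interactions are all estimable when every cell occurs.
fit_factors <- glm(status ~ gender * toe, family = binomial, data = dat)
print(summary(fit_factors)$coefficients)
```

With factors, a 2-level by 4-level interaction model has 8 coefficients (intercept, 1 + 3 main effects, 3 interaction terms), each interpreted relative to the reference levels. The coefficients for the reference levels are absent by design rather than missing.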