Hi,

I have some doubts about how qualitative factors are coded in R. For instance, consider a response y, a quantitative factor x and a qualitative factor m with 3 levels, generated as follows:

y <- c(6,4,2.3,5,3.5,4,1.,8.5,4.3,5.6,2.3,4.1,2.5,8.4,7.4)
x <- c(3,1,3,1,2,1,4,5,1,3,4,2,5,4,3)
m <- gl(3,5)

lm(y~x+m)

Coefficients:
(Intercept)            x           m2           m3
    3.96364      0.09818      0.44145      0.62291

In the literature, two implicit coding schemes are commonly suggested: m1=0 or m1=-m2-m3. Does R use one of these schemes? (I have already read the R documentation on this topic, and it is still not clear to me.)

Furthermore, how can I make predictions using this model? I mean, how should I specify the new data (especially for m)?

Thanks in advance for your help.

Regards,

Isabelle Zabalza-Mezghani

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
There are several ways to parameterize categorical variables, including corner-point, sum-to-zero, Helmert, and so on. In R you can select one globally using the options() function, for instance

options(contrasts = c(unordered = "contr.treatment", ordered = "contr.poly"))

or directly in the model formula:

lm(y~x+C(m, treatment))

or

lm(y~x+C(m, sum))

and so on. The interpretation of the estimated coefficients (and of their univariate Wald statistics) depends on the parameterization used, but the LRT does not.

best,
vito

----- Original Message -----
From: "ZABALZA-MEZGHANI Isabelle" <Isabelle.zabalza-mezghani at IFP.fr>
To: "help R (E-mail)" <r-help at stat.math.ethz.ch>
Cc: "JOURDAN Astrid" <Astrid.JOURDAN at IFP.fr>
Sent: Thursday, November 07, 2002 9:55 AM
Subject: [R] Qualitative factors
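To illustrate the point about parameterizations, here is a minimal sketch using the data from the original question. The object names (fit_trt, fit_sum, fit0) are illustrative only and do not come from the thread. It fits the same model under treatment and sum-to-zero coding and checks that the coefficients differ while the fitted values and the log-likelihood (and hence any likelihood-ratio comparison) do not.

```r
# Data from the original question
y <- c(6, 4, 2.3, 5, 3.5, 4, 1, 8.5, 4.3, 5.6, 2.3, 4.1, 2.5, 8.4, 7.4)
x <- c(3, 1, 3, 1, 2, 1, 4, 5, 1, 3, 4, 2, 5, 4, 3)
m <- gl(3, 5)

# Same model, two parameterizations of the factor m
fit_trt <- lm(y ~ x + C(m, treatment))  # corner-point: level 1 is the baseline (m1 = 0)
fit_sum <- lm(y ~ x + C(m, sum))        # sum-to-zero: m1 = -m2 - m3

# The coefficients differ because they answer different questions
coef(fit_trt)
coef(fit_sum)

# ...but both fits describe exactly the same model
same_fitted <- all.equal(unname(fitted(fit_trt)), unname(fitted(fit_sum)))
same_loglik <- all.equal(as.numeric(logLik(fit_trt)), as.numeric(logLik(fit_sum)))
same_fitted
same_loglik

# Comparison against the model without m gives the same test either way
fit0 <- lm(y ~ x)
anova(fit0, fit_trt)
anova(fit0, fit_sum)
```

Under sum-to-zero coding only two of the three level effects are reported; the effect for the last level is minus the sum of the other two.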
Dear Isabelle,

Contrast coding in R for factors and ordered factors is governed by the contrasts option:

> options("contrasts")
$contrasts
        unordered           ordered
"contr.treatment"      "contr.poly"

The default for an unordered factor is "treatment" (0/1) coding, using the first level as the baseline:

> contrasts(m)
  2 3
1 0 0
2 1 0
3 0 1

See ?contr.treatment for details and alternatives.

I hope that this helps,
John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
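On the prediction part of the original question, the following is a minimal sketch, assuming the default contrasts option shown above is unchanged. The names dat, fit, new, pred and by_hand are illustrative and not from the thread. New data are given as a data frame in which m is a factor with the same levels as in the fit; predict() then applies the 0/1 coding itself. The hand calculation shows that treatment coding corresponds to m1 = 0 in the notation of the question.

```r
# Data from the original question, collected in a data frame
dat <- data.frame(
  y = c(6, 4, 2.3, 5, 3.5, 4, 1, 8.5, 4.3, 5.6, 2.3, 4.1, 2.5, 8.4, 7.4),
  x = c(3, 1, 3, 1, 2, 1, 4, 5, 1, 3, 4, 2, 5, 4, 3),
  m = gl(3, 5)
)

fit <- lm(y ~ x + m, data = dat)

# New data: one row per level of m at x = 2.
# m is supplied as a factor with the same levels used in the fit.
new <- data.frame(
  x = c(2, 2, 2),
  m = factor(c("1", "2", "3"), levels = levels(dat$m))
)

pred <- predict(fit, newdata = new)
pred

# The same predictions by hand: level 1 contributes 0 (the baseline),
# level 2 adds the m2 coefficient, level 3 adds the m3 coefficient.
b <- coef(fit)
level_effect <- c(0, b[["m2"]], b[["m3"]])
by_hand <- b[["(Intercept)"]] + b[["x"]] * new$x + level_effect[as.integer(new$m)]
by_hand

all.equal(unname(pred), by_hand)
```

If a different contrast is set (globally or in the formula), predict() still works with the same kind of new data; only the hand calculation would change.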